Planet MySQL Planet MySQL: Meta Deutsch Español Français Italiano 日本語 Русский Português 中文
Showing entries 1 to 10 of 153 10 Older Entries

Displaying posts with tag: hadoop (reset)

Rosetta Stone: MySQL, Pig and Spark (Basics)
+0 Vote Up -0Vote Down

In a world where new data processing languages appear every day, it can be helpful to have tutorials explaining language characteristics in detail from the ground up.  This blog post is not such a tutorial.   It also isn’t a tutorial on getting started with MySQL or Hadoop, nor is it a list of best practices for the various languages I’ll reference here – there are bound to be better ways to accomplish certain tasks, and where a choice was required, I’ve emphasized clarity and readability over …

  [Read more...]
A Grand Tour of Big Data. Interview with Alan Morrison
+0 Vote Up -0Vote Down

“Leading enterprises have a firm grasp of the technology edge that’s relevant to them. Better data analysis and disambiguation through semantics is central to how they gain competitive advantage today.”–Alan Morrison.

I have interviewed Alan Morrison, senior research fellow at PwC, Center for Technology and Innovation.
Main topic of the interview is how the Big Data market is evolving.

RVZ

Q1. How do you see the Big Data market evolving? 



  [Read more...]
How to Deploy a Cluster
+0 Vote Up -0Vote Down

 

In this blog post I will talk about how to deploy a cluster, the methods I tried and my solution to resolving the prerequisites problem.

I’m fairly new to the big data field. Learning about Hadoop, I kept hearing the term “clusters”, deploying a cluster, and installing some services on namenode, some on datanode and so on. I also heard about Cloudera manager which helps me to deploy services on my cluster, so I set up a VM and followed several tutorials including the Cloudera documentation to install cloudera manager. However, every time I reached the “cluster installation” step my installation failed. I later found out that …

  [Read more...]
Using Apache Spark and MySQL for Data Analysis
+0 Vote Up -0Vote Down

What is Spark

Apache Spark is a cluster computing framework, similar to Apache Hadoop. Wikipedia has a great description of it:

Apache Spark is an open source cluster computing framework originally developed in the AMPLab at University of California, Berkeley but was later donated to the Apache Software Foundation where it remains today. In contrast to Hadoop’s two-stage disk-based MapReduce paradigm, Spark’s multi-stage in-memory primitives provides performance up to 100 times faster …

  [Read more...]
Log Buffer #443: A Carnival of the Vanities for DBAs
+0 Vote Up -0Vote Down

This Log Buffer Edition finds and publishes blog posts from Oracle, SQL Server and MySQL.

Oracle:

  • SCRIPT execution errors when creating a DBaaS instance with local and cloud backups.
  • Designing Surrogate vs Natural Keys with Oracle SQL Developer.
  [Read more...]
Replication in real-time from Oracle and MySQL into data warehouses and analytics
+0 Vote Up -0Vote Down

Analyzing transactional data is becoming increasingly common, especially as the data sizes and complexity increase and transactional stores are no longer to keep pace with the ever-increasing storage. Although there are many techniques available for loading data, getting effective data in real-time into your data warehouse store is a more difficult problem. VMware Continuent provides

What’s the latest with Hadoop
+0 Vote Up -0Vote Down

The Big Data explosion in recent years has created a vast number of new technologies in the area of data processing, storage, and management. One of the biggest names to appear on the scene is Hadoop. In case you need a quick review, Hadoop is a Big Data storage system that takes in large amounts of data from servers and breaks it into smaller, manageable chunks. The technology is complex but at a high level the Hadoop ecosystem essentially takes a “divide and conquer” approach to processing Big Data instead of processing data in tables, as in a relational database like …

  [Read more...]
2015: More innovation, but still a year of transition
+0 Vote Up -3Vote Down

First things first: I could use this title for every year, it is an evergreen. In order for this title to make sense, there must be a specific context and in this case the context is Big Data. We have seen new ideas and many announcements in 2014, and in 2015 those ideas will shape up and early versions of innovative products will start flourishing.Like many other people, I prepared some comments and opinions to post back in early January then, soon after the season’s break, I started flying around the world and the daily routine kept me away from the blog for some time. So, as a good last blogger, it may be time for me to post my own predictions, …

  [Read more...]
On Hadoop RDBMS. Interview with Monte Zweben.
+0 Vote Up -0Vote Down

“HBase and Hadoop are the only technologies proven to scale to dozens of petabytes on commodity servers, currently being used by companies such as Facebook, Twitter, Adobe and Salesforce.com.”–Monte Zweben.

Is it possible to turn Hadoop into a RDBMS? On this topic, I have interviewed Monte Zweben, Co-Founder and Chief Executive Officer of Splice Machine.

RVZ

Q1. What are the main challenges of applications and operational analytics that support real-time, interactive queries on data updated in real-time for Big Data?

…  [Read more...]
Exorcising the CAP Demon
+0 Vote Up -0Vote Down

Computer science is like an enormous tool box you can rummage through whenever you have a problem to solve. Most of the tools are sturdy and practical, like algorithms for B-trees. Some are also elegant, like consistent hashing in Dynamo. Finally there are some tools that you never quite figure out even after years of reflection. That piece of steel you are looking at could be Excalibur. Or it could be a rusty knife.

The CAP theorem falls into the last category, at least for me.  It was a major topic in the blogosphere a few years ago and Google Trends shows …

  [Read more...]
Showing entries 1 to 10 of 153 10 Older Entries

Planet MySQL © 1995, 2016, Oracle Corporation and/or its affiliates   Legal Policies | Your Privacy Rights | Terms of Use

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.