Dolphin and Elephant: an Introduction
This post is intended for MySQL DBAs and sysadmins who need to
start using Apache Hadoop and want to integrate the two
systems. In this post I will cover some basic information about
Hadoop, focusing on Hive, as well as MySQL and Hadoop/Hive
integration.
First of all, if you have worked with MySQL or any other
relational database for most of your professional life (like I have),
Hadoop may look different. Very different. In many ways, Hadoop is
the opposite of a relational database. Unlike a database,
where we have a set of tables and indexes, Hadoop works with a
set of text files. And… there are no indexes at all. And yes,
this may be shocking, but all scans are sequential (full “table”
scans in MySQL terms).
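To make the difference concrete, here is a minimal Python sketch. It is not Hadoop code, only an illustration of the access pattern. The sample rows, the "id,name" line format and the function names are all assumptions made up for this example. The first function reads every line to find a match, which is roughly what a scan over text files looks like. The dictionary stands in for the kind of index a relational database would use to jump straight to the row.

```python
# Hypothetical sample data: each line is "id,name", like rows in a plain text file.
lines = ["1,alice", "2,bob", "3,carol", "4,dave"]


def sequential_scan(lines, wanted_id):
    """Read every line and keep the matches (no index, so a full scan)."""
    examined = 0
    matches = []
    for line in lines:
        examined += 1
        row_id, name = line.split(",", 1)
        if row_id == wanted_id:
            matches.append(name)
    return matches, examined


def build_index(lines):
    """Build an in-memory lookup from id to name, a stand-in for a database index."""
    index = {}
    for line in lines:
        row_id, name = line.split(",", 1)
        index[row_id] = name
    return index


# The scan touches every line, even though only one of them matches.
scan_matches, scan_examined = sequential_scan(lines, "3")

# The index lookup goes directly to the row without reading the others.
index = build_index(lines)
index_match = index.get("3")
```

With four lines the difference does not matter, but the scan always examines every line, so its cost grows with the size of the data rather than with the size of the result.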
So, when does Hadoop make sense?
First, Hadoop is great if you need to …