We use both MySQL and Hadoop a lot. If you use each system for its strengths, this is a powerful combination. One problem we constantly face is making data extracted from our Hadoop cluster available in MySQL.
The problem
Look at this simple example. Let’s say we have a table customer:
CREATE TABLE customer (
  id INT UNSIGNED NOT NULL,
  firstname VARCHAR(100) NOT NULL,
  lastname VARCHAR(100) NOT NULL,
  city VARCHAR(100) NOT NULL,
  PRIMARY KEY (id)
);
In addition, we store the orders our customers placed in Hadoop. An order includes: customerId, date, itemId, price.
Note that these structures serve as a very simplified example.
Let’s say we want to find the first 50 customers that placed at least one order, sorted by firstname ascending. If both tables …
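To make the target result concrete, here is a minimal Python sketch of the lookup described above, run on tiny in-memory samples rather than on MySQL or Hadoop. The sample rows, the function name first_customers_with_orders, and the tie-break on id are assumptions made for illustration only.

```python
from typing import NamedTuple


class Customer(NamedTuple):
    id: int
    firstname: str
    lastname: str
    city: str


class Order(NamedTuple):
    customer_id: int
    date: str
    item_id: int
    price: float


def first_customers_with_orders(customers, orders, limit=50):
    """Return up to `limit` customers that placed at least one order,
    sorted by firstname ascending."""
    # Collect the ids of all customers that appear in at least one order.
    ids_with_orders = {order.customer_id for order in orders}
    matching = [c for c in customers if c.id in ids_with_orders]
    # Sort by firstname; the secondary sort on id is an assumption that
    # keeps the output deterministic when firstnames are equal.
    matching.sort(key=lambda c: (c.firstname, c.id))
    return matching[:limit]


# Hypothetical sample data, not taken from the original setup.
CUSTOMERS = [
    Customer(1, "Carol", "Jones", "Berlin"),
    Customer(2, "Alice", "Smith", "Hamburg"),
    Customer(3, "Bob", "Miller", "Munich"),
]
ORDERS = [
    Order(2, "2010-01-05", 17, 9.99),
    Order(1, "2010-01-06", 4, 24.50),
    Order(2, "2010-02-01", 8, 5.00),
]

RESULT = first_customers_with_orders(CUSTOMERS, ORDERS)
```

With this sample data, the customer without any orders is filtered out and the remaining two come back ordered by firstname. The sketch only pins down what the result should look like; it says nothing about how to compute it efficiently when the two data sets live in different systems.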