Planet MySQL

Displaying posts with tag: Columnar Storage

Jun

2014

Using InfiniDB MySQL server with Hadoop cluster for data analytics

Posted by Alexander Rubin of MySQL Performance Blog on Mon 02 Jun 2014 16:58 UTC
Tags:

hadoop, infinidb, MySQL, Columnar Storage, Impala, Data Analytics, mysql analytical queries

In my previous post about Hadoop and Impala I benchmarked performance of analytical queries in Impala.

This time I’ve tried InfiniDB for Hadoop (open-source version) on modern hardware with an 8-node Hadoop cluster. One of the main advantages (at least for me) of InfiniDB for Hadoop is that it stores the data inside the Hadoop cluster but uses the MySQL server to execute queries. This allows for an easy “migration” of existing analytical tools. The results are quite interesting and promising.

Quick How-To

The InfiniDB documentation is not very clear on step-by-step instructions so I’ve created this quick guide:

Install Hadoop cluster (minimum …


Aug

2012

Facebook makes big data look... big!

Posted by Doron Levari on Fri 31 Aug 2012 16:41 UTC
Tags:

database, scalability, sharding, facebook, big data, scale out, database scalability, Columnar Storage, Database Grid

Oh I love these things: http://techcrunch.com/2012/08/22/how-big-is-facebooks-data-2-5-billion-pieces-of-content-and-500-terabytes-ingested-every-day/

Every day 2.5B content items are shared and 2.7B "Likes" are recorded. I care less about the GIGO content itself, but the metadata, connections and relations are kept transactionally in a relational database. These two use cases generate 5.2B transactions on the database, and since there are only 86,400 seconds in a day, that comes to over 60,000 write transactions per second on the database from these two use cases alone, not to mention all the others, such as new profiles, emails, queries...
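The back-of-the-envelope arithmetic above can be checked with a few lines of Python. This is only a sketch: the daily counts are the figures quoted in the post, the function name `average_writes_per_second` is made up for illustration, and the result is a daily average that assumes one write transaction per share or Like and ignores peak-hour load.

```python
# Rough check of the write-rate estimate, using the figures quoted in the post.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

shares_per_day = 2_500_000_000  # content items shared per day
likes_per_day = 2_700_000_000   # "Likes" per day


def average_writes_per_second(*daily_counts):
    """Average rate, assuming one write transaction per event (an assumption)."""
    return sum(daily_counts) / SECONDS_PER_DAY


rate = average_writes_per_second(shares_per_day, likes_per_day)
print(f"{rate:,.0f} write transactions per second on average")
```

Real traffic is not spread evenly over the day, so the peak rate would be higher than this average.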

And what's the size of new data, on top of all the existing …


May

2012

Scale differences between OLTP and Analytics

Posted by Doron Levari on Tue 15 May 2012 04:08 UTC
Tags:

oltp, analytics, data warehouse, parallelism, scale out, database scalability, MySQL, Columnar Storage

In my previous post, http://database-scalability.blogspot.com/2012/05/oltp-vs-analytics.html, I reviewed the differences between OLTP and Analytics databases.

Scale challenges differ between these two worlds of databases.

Scale challenges in the Analytics world come from the growing amounts of data. Most solutions leverage three main aspects: columnar storage, RAM and parallelism.
Columnar storage makes scans and data filtering more precise and focused. After that, it all comes down to I/O: the faster the I/O, the sooner the query finishes and returns results. Faster disks and SSDs can play a good role, but above all: RAM! …
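To make the "more precise and focused" point concrete, here is a toy Python sketch comparing a row layout with a column layout for a single-column aggregate. The table, its column names and the "values touched" counter are all invented for illustration; real columnar engines add compression, vectorized execution and on-disk layouts that this does not model.

```python
# Toy comparison: how many stored values a SUM(amount) has to read
# in a row-oriented layout versus a column-oriented layout.
# The data and column names are hypothetical.
rows = [
    {"id": 1, "country": "US", "amount": 10.0},
    {"id": 2, "country": "DE", "amount": 20.0},
    {"id": 3, "country": "US", "amount": 5.0},
    {"id": 4, "country": "FR", "amount": 15.0},
]

# Column layout: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def sum_amount_row_store(table_rows):
    """Scan whole rows; every field of every row is read."""
    total, values_touched = 0.0, 0
    for row in table_rows:
        values_touched += len(row)
        total += row["amount"]
    return total, values_touched


def sum_amount_column_store(table_columns):
    """Scan only the one column the query needs."""
    amounts = table_columns["amount"]
    return sum(amounts), len(amounts)


row_total, row_touched = sum_amount_row_store(rows)
col_total, col_touched = sum_amount_column_store(columns)
print(f"row store:    total={row_total}, values touched={row_touched}")
print(f"column store: total={col_total}, values touched={col_touched}")
```

Both layouts return the same total, but the column layout reads only one value per row instead of all three, which is why less I/O is needed for the same query.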

