Planet MySQL

Displaying posts with tag: Impala

Jun

2014

Using InfiniDB MySQL server with Hadoop cluster for data analytics

Posted by Alexander Rubin of MySQL Performance Blog on Mon 02 Jun 2014 16:58 UTC
Tags:

hadoop, infinidb, MySQL, Columnar Storage, Impala, Data Analytics, mysql analytical queries

In my previous post about Hadoop and Impala I benchmarked the performance of analytical queries in Impala.

This time I’ve tried InfiniDB for Hadoop (open-source version) on modern hardware with an 8-node Hadoop cluster. One of the main advantages (at least for me) of InfiniDB for Hadoop is that it stores the data inside the Hadoop cluster but uses the MySQL server to execute queries. This allows for an easy “migration” of existing analytical tools. The results are quite interesting and promising.

Quick How-To

The InfiniDB documentation is not very clear on step-by-step instructions, so I’ve created this quick guide:

Install Hadoop cluster (minimum …

[Read more]

Apr

2014

Using Apache Hadoop and Impala together with MySQL for data analysis

Posted by Alexander Rubin of MySQL Performance Blog on Mon 21 Apr 2014 13:43 UTC
Tags:

scalability, hadoop, Hive, MySQL, Performance, Impala, Data Science

Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from MySQL to Hadoop, load the data into Cloudera Impala (columnar format) and run reports on top of that. For the examples below I will use the “ontime flight performance” data from my previous post (Increasing MySQL performance with parallel query execution). I used Cloudera Manager v.4 to install Apache Hadoop and Impala. For this test …

[Read more]

Sep

2013

MySQL webinar: ‘Introduction to open source column stores’

Posted by Justin Swanhart of MySQL Performance Blog on Thu 12 Sep 2013 21:39 UTC
Tags:

olap, analytics, hadoop, Infobright, luciddb, MonetDB, column stores, Justin Swanhart, MySQL, Impala, MySQL Webinars

Join me Wednesday, September 18 at 10 a.m. PDT for an hour-long webinar where I will introduce the basic concepts behind column store technology. The webinar’s title is: “Introduction to open source column stores.”

What will be discussed?

This webinar will talk about Infobright, LucidDB, MonetDB, Hadoop (Impala) and other column stores.

I will compare features between major column stores (both open and closed source).
Some benchmarks will be used to demonstrate the basic performance characteristics of the open source column stores.
There will be a question and answer session to ask me anything you like about column stores (you can also ask in the …

[Read more]

Apr

2013

Deploying Cloudera Impala on EC2 with Example Live Demo

Posted by Marco Tusa (Pythian) on Wed 03 Apr 2013 16:48 UTC
Tags:

ec2, hadoop, big data, Performance, Impala

A little while ago I blogged about (and open sourced) an Impala-powered soccer visualization demo, designed to demonstrate just how responsive Impala queries can be. Since not everyone has the time or resources to run the project themselves, we’ve decided to host it ourselves on an EC2 instance. You can try the visualization; we’ve also opened up the Impala web interface, where you can see query profiles and performance numbers, and Hue (username and password are both ‘test’), where you can run your own queries on the dataset.

Deploying Impala on EC2

While there are …

[Read more]

Jan

2013

The Data Day, Two days: January 15/16 2013

Posted by Matt Aslett on Wed 16 Jan 2013 19:38 UTC
Tags:

Oracle, Uncategorized, Tokutek, Hive, datastax, MySQL, HortonWorks, NuoDB, Clustrix, Impala, ObjectRocket, Ayasdi, Lattice Engines, zettaset

Funding for Ayasdi and Zettaset. NuoDB launches cloud database. And more

For 451 Research clients: NuoDB launches distributed ‘cloud data management system’ bit.ly/UO3ssM

— Matt Aslett (@maslett) January 15, 2013

For 451 clients: Armed with $20m series C, Lattice Engines looks to bring sales intelligence inside bit.ly/11z4VdF By Krishna Roy

— Matt Aslett (@maslett) January 16, 2013

Ayasdi Launches with $10 Million from Khosla Ventures and FLOODGATE. bit.ly/X7oemJ

— Matt Aslett (@maslett) …

[Read more]

Nov

2012

Typical “Big” Data Architecture

Posted by Venu Anuganti on Fri 30 Nov 2012 22:15 UTC
Tags:

postgresql, sql, database, scalability, ETL, hadoop, data warehouse, MapReduce, hbase, reporting, cloudera, NoSQL, vertica, Hive, bigdata, MySQL, SAS, Big Data Architecture, Big Data Warehouse, Data Architecture, Impala, NoSQL and BigData, Data Analytics, Data Science, kognitio, druid

Here is the typical “Big” data architecture that covers most components involved in the data pipeline. More or less, we have the same architecture in production in a number of places[...]
