Planet MySQL

Displaying posts with tag: analytics

Nov

2014

On Hadoop RDBMS. Interview with Monte Zweben.

Posted by Roberto V. Zicari on Sun 02 Nov 2014 18:15 UTC
Tags:

Java, Uncategorized, sql, RDBMS, analytics, hadoop, MapReduce, big data, NoSQL, nosql databases, relational databases, key value store, Monte Zweben, Splice Machine

“HBase and Hadoop are the only technologies proven to scale to dozens of petabytes on commodity servers, currently being used by companies such as Facebook, Twitter, Adobe and Salesforce.com.”–Monte Zweben.

Is it possible to turn Hadoop into an RDBMS? On this topic, I have interviewed Monte Zweben, Co-Founder and Chief Executive Officer of Splice Machine.

RVZ

Q1. What are the main challenges of applications and operational analytics that support real-time, interactive queries on data updated in real-time for Big Data?

Monte Zweben: Let’s break down “real-time, interactive queries on data updated in real-time for Big Data”. “Real-time, interactive queries” means that results need to be returned in milliseconds to a few seconds.
For “Data updated in real-time” to happen, …

Oct

2014

Data Warehouse in the Cloud - How to Upload MySQL data into Amazon Redshift for reporting and analytics

Posted by Severalnines on Mon 27 Oct 2014 14:23 UTC
Tags:

Other, scaling, Migration, analytics, data warehouse, reporting, big data, mariadb, galera, MySQL, redshift

October 27, 2014 By Severalnines

The term data warehousing often brings to mind things like large complex projects, big businesses, proprietary hardware and expensive software licenses. With Hadoop came open source data analysis software that ran on commodity hardware, which helped address at least some of the cost aspects. We had previously blogged about MongoDB and MySQL to Hadoop. But setting up and maintaining a Hadoop infrastructure might still be out of reach for small businesses or small projects with limited budgets. In that case, you might want to have a look at Redshift.

…

Aug

2014

Resources for Database Clusters: Performance Tuning for HAProxy, Support for MariaDB 10, Technical Blogs & More

Posted by Severalnines on Thu 28 Aug 2014 07:28 UTC
Tags:

Tools, Other, ha, Nginx, High Availability, webinar, ETL, analytics, hadoop, performance tuning, big data, mariadb, mongodb, haproxy, MySQL, clustercontrol

August 28, 2014 By Severalnines

Check Out Our Latest Resources for MySQL, MariaDB & MongoDB Clusters

Here is a summary of resources & tools that we’ve made available to you in the past weeks. If you have any questions on these, feel free to contact us!

New Technical Webinars

Performance Tuning of HAProxy for Database Load Balancing

09 September 2014 - with Baptiste Assmann of HAProxy Technologies

Do you know what HAProxy can tell you about your application and database instances? Do you know the difference between …

Jul

2014

Making Real-Time Analytics a Reality — TDWI -The Data Warehousing Institute

Posted by MC Brown on Tue 15 Jul 2014 13:52 UTC
Tags:

Oracle, Articles, Databases, analytics, hadoop, data migration, big data, MySQL

My article on how to make the real-time processing of information from traditional transactional stores into Hadoop a reality has been published over at TDWI:

Making Real-Time Analytics a Reality — TDWI -The Data Warehousing Institute.

Jun

2014

Big Data Integration & ETL - Moving Live Clickstream Data from MongoDB to Hadoop for Analytics

Posted by Severalnines on Mon 16 Jun 2014 08:15 UTC
Tags:

Other, Data Integration, ETL, Migration, analytics, hadoop, talend, data migration, big data, mongodb, MySQL, hdfs, tokumx, clickstream

June 16, 2014 By Severalnines

MongoDB is great at storing clickstream data, but using it to analyze millions of documents can be challenging. Hadoop provides a way of processing and analyzing data at large scale. Since it is a parallel system, workloads can be split across multiple nodes and computations on large datasets can be done in relatively short timeframes. MongoDB data can be moved into Hadoop using ETL tools like Talend or Pentaho Data Integration (Kettle).

In this blog, we’ll show you how to integrate your MongoDB and Hadoop datastores using Talend. We have a MongoDB database collecting clickstream data from several websites. We’ll create a job in Talend to extract the documents from MongoDB, transform and then load them into HDFS. We will also show you how to schedule this job to be executed every 5 minutes.
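The extract, transform and load steps described above can be pictured with a small Python sketch. This is not the Talend job from the post: the documents are plain dictionaries standing in for MongoDB documents, the field names (`_id`, `site`, `url`, `ts`) are hypothetical, and a local directory stands in for HDFS. A scheduler such as cron could call `run_job` every 5 minutes.

```python
import json
from datetime import timezone
from pathlib import Path


def transform(doc):
    """Flatten one clickstream document into a flat record (hypothetical fields)."""
    return {
        "id": str(doc["_id"]),
        "site": doc["site"].lower(),
        "url": doc["url"],
        # Normalize timestamps to UTC ISO-8601 strings.
        "ts": doc["ts"].astimezone(timezone.utc).isoformat(),
    }


def run_job(docs, out_dir):
    """Extract -> transform -> load: write newline-delimited JSON to out_dir.

    out_dir is a local directory here; it only stands in for an HDFS path.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    records = [transform(d) for d in docs]
    out_file = out_dir / "clickstream.jsonl"
    with out_file.open("w", encoding="utf-8") as fh:
        for rec in records:
            fh.write(json.dumps(rec) + "\n")
    return out_file, len(records)
```

Newline-delimited JSON is used here only because it is easy to split across nodes; the original post may use a different output format.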

Test Case

We have an application …

May

2014

Archival and Analytics - Importing MySQL data into Hadoop Cluster using Sqoop

Posted by Severalnines on Fri 16 May 2014 04:46 UTC
Tags:

Other, analytics, hadoop, mariadb, sqoop, galera, MySQL, archival

May 16, 2014 By Severalnines

We won’t bore you with buzzwords like volume, velocity and variety. This post is for MySQL users who want to get their hands dirty with Hadoop, so roll up your sleeves and prepare for work. Why would you ever want to move MySQL data into Hadoop? One good reason is archival and analytics. You might not want to delete old data, but rather move it into Hadoop and make it available for further analysis at a later stage.

In this post, we are going to deploy a Hadoop Cluster and export data in bulk from a Galera Cluster using Apache Sqoop. Sqoop is a well-proven approach for bulk data loading from a relational database into the Hadoop file system. There is also Hadoop Applier available from …
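As a rough illustration of a bulk import with Sqoop, the Python sketch below assembles a `sqoop import` command line. The host, database, user, table and target directory are hypothetical placeholders, not values from the post; the flags shown (`--connect`, `--username`, `-P`, `--table`, `--target-dir`, `--num-mappers`) are standard Sqoop import options.

```python
import shlex


def sqoop_import_command(jdbc_url, username, table, target_dir, mappers=4):
    """Build the argument list for a bulk `sqoop import` of one table."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,          # JDBC URL of the source database
        "--username", username,
        "-P",                           # prompt for the password interactively
        "--table", table,
        "--target-dir", target_dir,     # destination directory in HDFS
        "--num-mappers", str(mappers),  # number of parallel map tasks
    ]


# Hypothetical values, for illustration only.
cmd = sqoop_import_command(
    "jdbc:mysql://galera-host:3306/mydb",
    "sqoop_user",
    "orders",
    "/user/hadoop/orders",
)
print(shlex.join(cmd))
```

The printed command could then be run on a node where Sqoop and the MySQL JDBC driver are installed.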

Nov

2013

MariaDB CONNECT Storage Engine replay & slides available

Posted by Serge Frezefond on Fri 08 Nov 2013 14:10 UTC
Tags:

scalability, analytics, sharding, BI, mariadb, MySQL

The slides and replay of yesterday’s webinar on the MariaDB CONNECT storage engine have just been posted. First I want to thank the numerous attendees. You have shown great interest in the parallel execution of queries on distributed MySQL servers. I agree this is cool. The ODBC capabilities also seem to generate interest. This makes [...]

Nov

2013

MariaDB CONNECT Storage Engine and parallelism

Posted by Serge Frezefond on Tue 05 Nov 2013 17:16 UTC
Tags:

community, cluster, scalability, analytics, sharding, BI, MySQL

The CONNECT storage engine implements the concept of a table made of multiple tables. These underlying tables can be distributed remotely. For example, the underlying remote tables can be of ODBC or MySQL table type. This makes it possible to execute distributed queries. What is nice is that we can execute such a distributed query in parallel.

How [...]
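The idea of one logical table backed by several underlying tables, queried in parallel, can be sketched conceptually in Python. This is not how the CONNECT engine is implemented or configured: the sub-tables are in-memory lists standing in for remote tables, and the names `query_subtable` and `query_all` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def query_subtable(rows, min_amount):
    """Run the same filter against one underlying table."""
    return [r for r in rows if r["amount"] >= min_amount]


def query_all(subtables, min_amount):
    """Fan the query out to every sub-table in parallel, then merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(subtables))) as pool:
        parts = pool.map(lambda rows: query_subtable(rows, min_amount), subtables)
        # pool.map yields results in sub-table order, so the merge is deterministic.
        return [row for part in parts for row in part]
```

With real remote tables, each worker would spend most of its time waiting on the network, which is where running the sub-queries concurrently pays off.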

Oct

2013

InfiniDB column store moves to open source! Congrats!

Posted by Serge Frezefond on Tue 15 Oct 2013 22:04 UTC
Tags:

scalability, analytics, BI, MySQL, Performance

Like TokuDB, InfiniDB is now a fully open source server product. In the past InfiniDB was “almost open source”. The open source version was an old release with no access to advanced functions like MPP multi-server execution. This is no longer the case. With InfiniDB 4 the open source version is the latest release [...]

Sep

2013

Data Analytics at NBCUniversal. Interview with Matthew Eric Bassett.

Posted by Roberto V. Zicari on Mon 23 Sep 2013 14:48 UTC
Tags:

Open Source, Uncategorized, Python, amazon, cloud, analytics, hadoop, MapReduce, big data, NoSQL, MySQL, cloud stores, nosql databases, relational databases, Amazon's EC, Elastic MapReduce, Matthew Eric Bassett, NBCUniversal

“The most valuable thing I’ve learned in this role is that judicious use of a little bit of knowledge can go a long way. I’ve seen colleagues and other companies get caught up in the “Big Data” craze by spending hundreds of thousands of pounds sterling on a Hadoop cluster that sees a few megabytes [...]
