Showing entries 21 to 30 of 70
Displaying posts with tag: analytics
Resources for Database Clusters: Performance Tuning for HAProxy, Support for MariaDB 10, Technical Blogs & More

August 28, 2014 By Severalnines

Check Out Our Latest Resources for MySQL, MariaDB & MongoDB Clusters

Here is a summary of the resources & tools that we’ve made available to you over the past few weeks. If you have any questions about these, feel free to contact us!

New Technical Webinars

 

Performance Tuning of HAProxy for Database Load Balancing

09 September 2014 - with Baptiste Assmann of HAProxy Technologies

Do you know what HAProxy can tell you about your application and database instances? Do you know the difference …

[Read more]
Making Real-Time Analytics a Reality — TDWI -The Data Warehousing Institute

My article on how to make the real-time processing of information from traditional transactional stores into Hadoop a reality has been published over at TDWI:

Making Real-Time Analytics a Reality — TDWI -The Data Warehousing Institute.


Filed under: Articles Tagged: analytics, big data, data migration, databases, hadoop, mysql, …

[Read more]
Big Data Integration & ETL - Moving Live Clickstream Data from MongoDB to Hadoop for Analytics

June 16, 2014 By Severalnines

MongoDB is great at storing clickstream data, but using it to analyze millions of documents can be challenging. Hadoop provides a way of processing and analyzing data at large scale. Since it is a parallel system, workloads can be split across multiple nodes, and computations on large datasets can be done in relatively short timeframes. MongoDB data can be moved into Hadoop using ETL tools like Talend or Pentaho Data Integration (Kettle).

In this blog post, we’ll show you how to integrate your MongoDB and Hadoop datastores using Talend. We have a MongoDB database collecting clickstream data from several websites. We’ll create a job in Talend to extract the documents from MongoDB, transform them, and then load them into HDFS. We will also show you how to schedule this job to be executed every 5 minutes.
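
As a rough illustration of the extract, transform and load flow described above (this is not the Talend job itself), the following Python sketch flattens clickstream documents into tab-separated lines of the kind that could be written to HDFS. The field names (`ts`, `url`, `user_id`), the sample documents and the function names are all hypothetical; a real job would read from MongoDB and write to HDFS rather than work on in-memory lists.

```python
from datetime import datetime, timezone

# Hypothetical clickstream documents, standing in for a MongoDB collection.
SAMPLE_DOCS = [
    {"_id": 1, "ts": 1402900000, "url": "/home", "user_id": "u1"},
    {"_id": 2, "ts": 1402900300, "url": "/pricing\tplans", "user_id": None},
]

FIELDS = ("ts", "url", "user_id")  # assumed field names, not from the original post


def transform(doc):
    """Flatten one document into a tab-separated line."""
    values = []
    for field in FIELDS:
        value = doc.get(field)
        if field == "ts" and value is not None:
            # Convert epoch seconds to an ISO 8601 UTC timestamp.
            value = datetime.fromtimestamp(value, tz=timezone.utc).isoformat()
        # Write missing values as empty strings and keep tabs out of the data.
        values.append("" if value is None else str(value).replace("\t", " "))
    return "\t".join(values)


def run_job(docs):
    """Extract the documents, transform each one, and return the lines to load."""
    return [transform(doc) for doc in docs]


lines = run_job(SAMPLE_DOCS)
```

Each call to `run_job` corresponds to one scheduled execution; running it every 5 minutes would be left to an external scheduler.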

 

Test Case

 

We have an application …

[Read more]
Archival and Analytics - Importing MySQL data into Hadoop Cluster using Sqoop

May 16, 2014 By Severalnines

We won’t bore you with buzzwords like volume, velocity and variety. This post is for MySQL users who want to get their hands dirty with Hadoop, so roll up your sleeves and prepare for work. Why would you ever want to move MySQL data into Hadoop? One good reason is archival and analytics. You might not want to delete old data, but rather move it into Hadoop and make it available for further analysis at a later stage. 

 

In this post, we are going to deploy a Hadoop cluster and export data in bulk from a Galera Cluster using Apache Sqoop. Sqoop is a well-proven approach for bulk data loading from a relational database into the Hadoop Distributed File System (HDFS). There is also Hadoop Applier available from …

[Read more]
MariaDB CONNECT Storage Engine replay & slides available

The slides and replay of yesterday’s webinar on the MariaDB CONNECT storage engine have just been posted. First, I want to thank the numerous attendees. You have shown great interest in the parallel execution of queries on distributed MySQL servers. I agree this is cool. The ODBC capabilities also seem to generate interest. This makes [...]

MariaDB CONNECT Storage Engine and parallelism

The CONNECT storage engine implements the concept of a table made of multiple tables. These underlying tables can be distributed remotely. For example, the underlying remote tables can be of the ODBC or MySQL table type. This makes it possible to execute distributed queries. What is nice is that these distributed queries can be executed in parallel.
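
To make the idea concrete, here is a minimal Python sketch of the general pattern: the same query is sent to several underlying tables in parallel and the partial results are merged. It does not show how CONNECT is implemented. In-memory SQLite databases stand in for the remote MySQL or ODBC tables, and the table name, shard names and data are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each list stands in for a table on a separate remote server.
SHARDS = {
    "server_a": [("fr", 10), ("de", 5)],
    "server_b": [("fr", 7), ("us", 3)],
    "server_c": [("de", 1)],
}

QUERY = "SELECT country, SUM(hits) FROM visits GROUP BY country"


def query_shard(rows):
    """Run QUERY against one underlying table and return its partial result."""
    # Each worker opens its own connection, as a client would per remote server.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE visits (country TEXT, hits INTEGER)")
        conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


def query_all(shards):
    """Query every shard in parallel and merge the partial aggregates."""
    totals = {}
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        for partial in pool.map(query_shard, shards.values()):
            for country, hits in partial:
                totals[country] = totals.get(country, 0) + hits
    return totals


totals = query_all(SHARDS)
```

The merge step matters: each shard returns only a partial aggregate, so the per-country sums have to be added together once all workers have finished.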

How [...]

InfiniDB column store moves to open source! Congrats!

Like TokuDB, InfiniDB is now a fully open source server product. In the past, InfiniDB was “almost open source”. The open source version was an old release with no access to advanced functions like MPP multi-server execution. This is no longer the case. With InfiniDB 4, the open source version is the latest release [...]

Data Analytics at NBCUniversal. Interview with Matthew Eric Bassett.

“The most valuable thing I’ve learned in this role is that judicious use of a little bit of knowledge can go a long way. I’ve seen colleagues and other companies get caught up in the “Big Data” craze by spending hundreds of thousands of pounds sterling on a Hadoop cluster that sees a few megabytes [...]

MySQL webinar: ‘Introduction to open source column stores’

Join me Wednesday, September 18 at 10 a.m. PDT for an hour-long webinar where I will introduce the basic concepts behind column store technology. The webinar’s title is: “Introduction to open source column stores.”

What will be discussed?

This webinar will cover Infobright, LucidDB, MonetDB, Hadoop (Impala) and other column stores.

  • I will compare features between major column stores (both open and closed source).
  • Some benchmarks will be used to demonstrate the basic performance characteristics of the open source column stores.
  • There will be a question and answer session to ask me anything you like about column stores (you can also ask in the …
[Read more]
On PostgreSQL. Interview with Tom Kincaid.

“Application designers need to start by thinking about what level of data integrity they need, rather than what they want, and then design their technology stack around that reality. Everyone would like a database that guarantees perfect availability, perfect consistency, instantaneous response times, and infinite throughput, but it’s not possible to create a product with [...]
