Showing entries 31 to 40 of 188
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: big data (reset)
Thoughts on Small Datum – Part 1

A little background…

When I ventured into sales and marketing (I’m an engineer by education) I learned I would often have to interpret and simply summarize the business value that is sometimes hidden in benchmarks. Simply put, the people who approve the purchase of products like TokuDB® and TokuMX™ appreciate the executive summary.

Therefore, I plan to publish a multipart series here on TokuView where I will share my simple summaries and thoughts on business value for the benchmarks Mark Callaghan (@markcallaghan), a former Google and now Facebook database guru, is publishing on his blog, Small Datum.

I’m going to start with his first benchmark post and work my way forward to …

[Read more]
Tungsten Replicator 3.0 is Cloudera Enterprise 5 Certified

One of the key platforms I’ve been testing on for the MySQL to Hadoop replication has been Cloudera, largely driven by customer requirements, but it’s also one of the easiest way to get started with Hadoop.

What I’m even more pleased about is the fact that we are proud to announce that Tungsten Replicator 3.0 is certified for use on the new Cloudera Enterprise 5 platform. That means that we’re sure that replicating your data from MySQL to Cloudera 5 and have it work without causing problems or difficulties on the Hadoop loading and materialisation.

Cloudera is a great product, and we’re very happy to be working so effectively with the new Cloudera Enterprise 5. Cloudera …

[Read more]
Continuent Replication to Hadoop – Now in Stereo!

Hopefully by now you have already seen that we are working on Hadoop replication. I’m happy to say that it is going really well. I’ve managed to push a few terabytes of data and different data sets through into Hadoop on Cloudera, HortonWorks, and Amazon’s Elastic MapReduce (EMR). For those who have been following my long association with the IBM InfoSphere BigInsights Hadoop product, and I’m pleased to say that it’s working there too. I’ve had to adapt Robert’s original script to work with the different versions of the underlying Hadoop tools and systems to make it compatible. The actual performance and process is unchanged; you just use a different JS-based batchloader script to work with different tools.

Robert has also been simplifying some of the core functionality, such as configuring some fixed pre-determined formats, so you no longer have to explicitly set the field and record separators.

I’ve also been …

[Read more]
Real-Time Data Loading from MySQL to Hadoop using Tungsten Replicator 3.0 Webinar

To follow-up and describe some of the methods and techniques behind replicating into Hadoop from MySQL in real-time, and how this can be combined into your data workflow, Continuent are running a webinar with me presenting that will go over the details and provide a demo of the data replication process.

Real-Time Data Loading from MySQL to Hadoop with New Tungsten Replicator 3.0

Hadoop is an increasingly popular means of analyzing transaction data from MySQL. Up until now mechanisms for moving data between MySQL and Hadoop have been rather limited. The new Continuent Tungsten Replicator 3.0 provides enterprise-quality replication from MySQL to Hadoop. Tungsten Replicator 3.0 is 100% open source, released under a GPL V2 license, and available for download at Continuent Tungsten handles MySQL transaction …

[Read more]
March 20 Webinar: How to Scale MySQL for Big Data Applications

You may think that you have to buy, install, and get up to speed on a new database if you want to work with large amounts of data, but you can do more than you think with the MySQL you already have.
Register Now!

SPEAKER: Jon Tobin, Tokutek
DATE: Thursday, March 20th
TIME: 1pm ET

Without having to change your application or do special tuning you can increase performance and save significant time and money when you need to scale.

Join Tokutek’s Jon Tobin as he demonstrates how to use MySQL or MariaDB in Big Data applications by simply upgrading the storage engine with TokuDB, and how to effectively evaluate TokuDB for increased performance, compression and agility.

During this webinar you will learn:

  • How to dramatically increase …
[Read more]
Big Data: Three questions to Aerospike.

“Many tools now exist to run database software without installing software. From vagrant boxes, to one click cloud install, to a cloud service that doesn’t require any installation, developer ease of use has always been a path to storage platform success.”–Brian Bulkowski.

The fifth interview in the “Big Data: three questions to “ series of interviews, is with Brian Bulkowski, Aerospike co-founder and CTO.


Q1. What is your current product offering?

Brian Bulkowski: Aerospike is the first in-memory NoSQL database optimized for flash or solid state drives (SSDs).
In-memory for speed and NoSQL for scale. Our approach to memory is unique – we have built our own file system to access …

[Read more]
Real-Time Replication from MySQL to Cassandra

Earlier this month I blogged about our new Hadoop applier, I published the docs for that this week ( as part of the Tungsten Replicator 3.0 documentation ( It contains some additional interesting nuggets that will appear in future blog posts.

The main part of that functionality that performs the actual applier for Hadoop is based around a JavaScript applier engine – there will eventually be docs for that as part of the Batch Applier content ( …

[Read more]
MySQL Cluster to Hadoop

How do you get data from a MySQL Cluster into Hadoop? Easy, replicate from the cluster to a stand alone MySQL instance and from there use the MySQL Hadoop Applier to HDFS.

This question came from a long time MySQL user who has jumped into the Big Data world.

No Hadoop Fun for Me at SCaLE 12X :(

I blogged a couple of weeks ago about my upcoming MySQL/Hadoop talk at SCaLE 12X. Unfortunately I had to cancel. A few days after writing the article I came down with an eye problem that is fixed but prevents me from flying anywhere for a few weeks. That's a pity as I was definitely looking forward to attending the conference and explaining how Tungsten replicates transactions from MySQL into HDFS.

Meanwhile, we are still moving at full steam with Hadoop-related work at Continuent, which is the basis for the next major replication release, Tungsten Replicator 3.0.0. Binary builds and documentation will go up in a few days. There will also be many more public talks about Hadoop support, starting in April at …

[Read more]
Why Aren't All Data Immutable?

Over the last few years there has been an increasing interest in immutable data management. This is a big change from the traditional update-in-place approach many database systems use today, where new values delete old values, which are then lost. With immutable data you record everything, generally using methods that append data from successive transactions rather than replacing them.  In some DBMS types you can access the older values, while in others the system transparently uses the old values to solve useful problems like implementing eventual consistency.

Baron Schwartz recently pointed out that it can be hard to get decent transaction processing performance based on append-only methods like append-only B-trees.  This is not a very strong argument against immutable data per se. …

[Read more]
Showing entries 31 to 40 of 188
« 10 Newer Entries | 10 Older Entries »