Home |  MySQL Buzz |  FAQ |  Feeds |  Submit your blog feed |  Feedback |  Archive |  Aggregate feed RSS 2.0 English Deutsch Español Français Italiano 日本語 Русский Português 中文
Showing entries 1 to 11

Displaying posts with tag: Coalface (reset)

Revisiting ZFS and MySQL
+1 Vote Up -0Vote Down

While at Percona Live this year I was reminded about ZFS and running MySQL on top of a ZFS-based storage platform.

Now I’m a big fan of ZFS (although sadly I don’t get to use it as much as I used to after I shutdown my home server farm), and I did a lot of different testing back while at MySQL to ensure that MySQL, InnoDB and ZFS worked correctly together.

Of course today we have a completely new range of ZFS compatible environments, not least of which are FreeBSD and ZFS on Linux, I think it’s time to revisit some of my original advice on using this combination.

Unfortunately the presentations and MySQL University sessions back then have all been taken down. But that doesn’t mean the advice is any less valid.

Some of the core advice for using InnoDB on ZFS:

  • Configure a single InnoDB tablespace,
  [Read more...]
Tungsten Replicator 3.0 is Cloudera Enterprise 5 Certified
+0 Vote Up -0Vote Down

One of the key platforms I’ve been testing on for the MySQL to Hadoop replication has been Cloudera, largely driven by customer requirements, but it’s also one of the easiest way to get started with Hadoop.

What I’m even more pleased about is the fact that we are proud to announce that Tungsten Replicator 3.0 is certified for use on the new Cloudera Enterprise 5 platform. That means that we’re sure that replicating your data from MySQL to Cloudera 5 and have it work without causing problems or difficulties on the Hadoop

  [Read more...]
Continuent Replication to Hadoop – Now in Stereo!
+0 Vote Up -0Vote Down

Hopefully by now you have already seen that we are working on Hadoop replication. I’m happy to say that it is going really well. I’ve managed to push a few terabytes of data and different data sets through into Hadoop on Cloudera, HortonWorks, and Amazon’s Elastic MapReduce (EMR). For those who have been following my long association with the IBM InfoSphere BigInsights Hadoop product, and I’m pleased to say that it’s working there too. I’ve had to adapt Robert’s original script to work with the different versions of the underlying Hadoop tools and systems to make it compatible. The actual performance and process is unchanged; you just use a different JS-based batchloader script to work with different tools.

Robert has also been simplifying some of the core functionality, such as configuring some fixed pre-determined

  [Read more...]
Parallel Extractor for Provisioning
+0 Vote Up -0Vote Down

Coming up as a new feature in Tungsten Replicator (and written by our replicator expert Stephane Giron) is the ability to provision a new database by using data from an existing database. This new feature comes in the form of a tool called the Parallel Extractor.

The principles are very simple. On the master side:

  • Start the master replicator offline.
  • Switch the replicator to the online provision state.
  • The master replicator pulls the data out of the existing database and writes that information into the Transaction History Log (THL). At this point, the normal replicator thread is not extracting events from the source database.
  • Once the parallel replication has completed, the replicator switches over to normal extraction mode, and starts writing change data into the THL.

On the slave side, the THL events are read

  [Read more...]
MySQL to Hadoop Step-By-Step
+0 Vote Up -0Vote Down

We had a great webinar on Thursday about replicating from MySQL to Hadoop (watch the whole thing). It was great, but one of the questions at the end was ‘is there an easy way to test’.

Sadly we can’t go giving out convenient ready-to-run downloads of these things because of licensing and and other complexities, so I want to try and make it as simple and straightforward as possible by giving you the directions to complete. I’m going to be point to the Continuent Documentation every now and then so this is not too crowded, but we should get through it pretty easily.

Major Decisions

For this to work: 

  • We’ll setup two VMs, one the master
  [Read more...]
Real-Time Replication from MySQL to Cassandra
+1 Vote Up -0Vote Down

Earlier this month I blogged about our new Hadoop applier, I published the docs for that this week (http://docs.continuent.com/tungsten-replicator-3.0/deployment-hadoop.html) as part of the Tungsten Replicator 3.0 documentation (http://docs.continuent.com/tungsten-replicator-3.0/index.html). It contains some additional interesting nuggets that will appear in future blog posts.

The main part of that functionality that performs the actual applier for Hadoop is based around a JavaScript applier engine – there will eventually be docs for that as part of the Batch Applier content (

  [Read more...]
Getting Data into Hadoop in real-time
+0 Vote Up -0Vote Down

Moving data between databases is hard. Without ever intending it, I seem to have spent a lifetime working on solutions for getting data into and out of databases, but more frequently between. In fact, my first job out of university was migrating data from BRS/Text, a free-text database (probably what we would call a NoSQL) into a more structured Oracle.

Today I spend some of my time working in Big Data, more often than not, migrating information from existing data stores into Big Data so that they can be analysed, something I covered in more detail here:

http://www.ibm.com/developerworks/library/bd-sqltohadoop1/index.html
http://www.ibm.com/developerworks/library/bd-sqltohadoop2/index.html


  [Read more...]
Anonymizing Data During Replication
+0 Vote Up -0Vote Down

If you happen to work with personal data, chances are you are subject to SOX (Sarbanes-Oxley) whether you like it or not.

One of the worst aspects of this is that if you want to be able to analyse your data and you replicate out to another host, you have to find a way of anonymizing the information. There are of course lots of ways of doing this, but if you are replicating the data, why not anonymize it during the replication?

Of the many cool features in Tungsten Replicator, one of my favorites is filtering. This allows you to process the stream of changes that are coming from the data extracted from the master and perform operations on it. We use it a lot in the replicator for ignoring tables, schemas and columns, and for ensuring that we have the correct information within the THL.

Given this, let’s use it to anonymize the data as it is being

  [Read more...]
Percona Live 2013, MySQL, Continuent and an ever-healthy Ecosystem
+0 Vote Up -0Vote Down

I’m sitting here in the lounge at SFO thinking back on the last week, the majority of which has been spent meeting my new workmates and attending the Percona MySQL conference.

For me it has been as much of a family reunion as it has been about seeing the wonderful things going on in MySQL.

Having joined Continuent last month after an ‘absence’ in NoSQL land of almost 2.5 years, joining the MySQL community again just felt like coming home after a long absence. And that’s no bad thing. On a very personal level it was great to see so many of my old friends, many of whom were not only pleased to see me, but pleased to see me working back in the MySQL fold. Evidently many people think this is where I belong.

What was great to see is that the MySQL community is alive and

  [Read more...]
Joining Continuent
+0 Vote Up -0Vote Down

I’ve just completed my first month here at Continuent, strangely back into the MySQL ecosystem which I have been working in for some time before I joined CouchOne, and then Couchbase, two and half years ago. Making the move back to MySQL is both an experience, and somehow, comfortable…

Continuent produce technology that makes for easier replication between MySQL servers and, more importantly, more flexible solutions when you need to scale out by providing connector and management functionality for your MySQL cluster. That means that you can easily backup, add slaves, and create complex replication scenarios such as multi-master, and even multiple-site, multiple-master topologies. This functionality is split over two products, Continuent Tungsten, which is the cluster management product, and the open source Tungsten Replicator, which provides the basic replication

  [Read more...]
Moving from MySQL to Couchbase
+0 Vote Up -0Vote Down

Before moving to Couchbase and working with NoSQL technology I had for years been a MySQL user. Making that leap from MySQL to NoSQL requires a number of changes, not least of which to the way you structure your data and then query it. 

I’ve tried to distil the first part of that process down into a simpler form and steps in a new blog post oat the Couchbase blog http://blog.couchbase.com/how-move-mysql-couchbase-server-20-part-1


Showing entries 1 to 11

Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates   Legal Policies | Your Privacy Rights | Terms of Use

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.