
Displaying posts with tag: hadoop

"Minute-to-win-it" Blue Studio by Beats - Heterogenous Replication Survey
Continuent would like to better understand the relationships and data flows between the different database systems you are using, so that we can better understand your replication and data integration needs. In particular, we'd like to know about any heterogeneous data exchanges, including manual dump/load and automated processes, and whether non-database sources, such as Twitter and Facebook,
Reflections on return to MySQL Community and Ecosystem
After a four-year hiatus, my participation in last week’s Percona Live MySQL Users conference marked my official return to the MySQL Community and Ecosystem. As with earlier editions, this year’s “UC” was very well attended, with a healthy mix of familiar faces and new blood, all coming together to discuss, present and explore the boundaries of the most popular and widely used open source database on the planet.  There were many good, informative keynote and technical sessions and BoFs, and the exhibit hall was packed for most of the operating hours with those interested in what the MySQL ecosystem is up to.  I also found it very refreshing that Oracle was among the most active in presenting useful technical content around their current and future MySQL open source product releases. All in  [Read more...]
PerconaLive Keynote: Getting Serious about MySQL and Hadoop at Continuent
Lean, mean MySQL and hulking Hadoop clusters may seem like an odd couple, but tying them together is now priority #1 for many MySQL users. This keynote talk from the first day of this year's Percona Live MySQL Conference & Expo 2014 explores the data management trends spurring integration, how the MySQL community is stepping up, and where the integration may go in the future. Robert Hodges, CEO at
Tungsten Replicator 3.0 is Cloudera Enterprise 5 Certified

One of the key platforms I’ve been testing on for MySQL to Hadoop replication has been Cloudera, largely driven by customer requirements, but it’s also one of the easiest ways to get started with Hadoop.

What I’m even more pleased about is that Tungsten Replicator 3.0 is now certified for use on the new Cloudera Enterprise 5 platform. That means we’re confident that you can replicate your data from MySQL to Cloudera 5 and have it work without causing problems or difficulties on the Hadoop

  [Read more...]
Continuent Replication to Hadoop – Now in Stereo!

Hopefully by now you have already seen that we are working on Hadoop replication. I’m happy to say that it is going really well. I’ve managed to push a few terabytes of data and different data sets through into Hadoop on Cloudera, HortonWorks, and Amazon’s Elastic MapReduce (EMR). For those who have been following my long association with the IBM InfoSphere BigInsights Hadoop product, I’m pleased to say that it’s working there too. I’ve had to adapt Robert’s original script to work with the different versions of the underlying Hadoop tools and systems. The actual performance and process are unchanged; you just use a different JS-based batchloader script to work with the different tools.

Robert has also been simplifying some of the core functionality, such as configuring some fixed pre-determined

  [Read more...]
Webinar-on-Demand: Real-Time Data Loading from MySQL to Hadoop
Hadoop is an increasingly popular means of analyzing transaction data from a single MySQL server or multiple MySQL servers. Up until now, mechanisms for moving data between MySQL and Hadoop have been rather limited. The new Continuent Tungsten Replicator 3.0 provides enterprise-quality replication from MySQL to Hadoop. Tungsten Replicator 3.0 is 100% open source, released under a GPL V2 license, and
Don't miss these Tungsten talks at Percona Live MySQL Conference & Expo
Keynotes and Sessions:
  • Keynote: Getting Serious about MySQL and Hadoop at Continuent - Robert Hodges (CEO, Continuent)
  • Hadoop for MySQL People - Chris Schneider (Database Architect, Groupon.com)
  • From Dolphins to Elephants: Real-Time MySQL to Hadoop Replication - MC Brown (Director of Documentation, Continuent), Linas Virbalas (Senior Software Engineer, Continuent)
  • Virtually Available MySQL, or How to
We're hiring!
Continuent, a leading provider of database clustering and replication software, has five (5) new positions open:
  • Build/Test Engineer
  • Senior Database Availability and Clustering Engineer
  • Senior Database Replication Engineer
  • Data Replication Sales Engineer
  • Clustering and Replication Test Development Engineer
If you want to get in on the ground floor of a growing company in a challenging field
Real-Time Data Loading from MySQL to Hadoop using Tungsten Replicator 3.0 Webinar

To follow up and describe some of the methods and techniques behind replicating into Hadoop from MySQL in real time, and how this can be combined into your data workflow, Continuent are running a webinar, presented by me, that will go over the details and provide a demo of the data replication process.

Real-Time Data Loading from MySQL to Hadoop with New Tungsten Replicator 3.0

Hadoop is an increasingly popular means of analyzing transaction data from MySQL. Up until now, mechanisms for moving data between MySQL and Hadoop have been rather limited. The new Continuent Tungsten Replicator 3.0 provides enterprise-quality replication from MySQL to Hadoop. Tungsten Replicator 3.0 is 100% open source, released under a GPL V2 license, and available for download at

  [Read more...]
MySQL to Hadoop Step-By-Step

We had a great webinar on Thursday about replicating from MySQL to Hadoop (watch the whole thing). One of the questions at the end was ‘is there an easy way to test?’

Sadly we can’t go giving out convenient ready-to-run downloads of these things because of licensing and other complexities, so I want to make it as simple and straightforward as possible by giving you the directions to complete it yourself. I’m going to point to the Continuent Documentation every now and then so this doesn’t get too crowded, but we should get through it pretty easily.

Major Decisions

For this to work: 

  • We’ll set up two VMs, one the master
  [Read more...]
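
The step-by-step excerpt above is cut off at the VM setup, but one prerequisite that usually comes up when replicating out of MySQL into Hadoop is row-based binary logging on the master. The following is a minimal sketch of such a check, not part of the original article; it assumes the mysql-connector-python package, and the host and credentials are hypothetical placeholders.

```python
# Sketch: verify the MySQL master is configured for row-based replication.
# Assumes mysql-connector-python; host/user/password are hypothetical placeholders.
import mysql.connector

REQUIRED = {
    "log_bin": "ON",         # binary logging must be enabled
    "binlog_format": "ROW",  # row-based events are typical for heterogeneous targets
}

conn = mysql.connector.connect(host="master-vm", user="tungsten", password="secret")
cur = conn.cursor()
for variable, expected in REQUIRED.items():
    cur.execute("SHOW GLOBAL VARIABLES LIKE %s", (variable,))
    row = cur.fetchone()
    actual = row[1] if row else None
    print(f"{variable} = {actual} ({'OK' if actual == expected else 'expected ' + expected})")
cur.close()
conn.close()
```
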
Real-time data loading from MySQL to Hadoop
Hadoop is an increasingly popular means of analyzing transaction data from MySQL. Up until now, mechanisms for moving data between MySQL and Hadoop have been rather limited. Continuent Tungsten Replicator provides enterprise-quality replication from MySQL to Hadoop under a GPL V2 license. Continuent Tungsten handles MySQL transaction types, including INSERT/UPDATE/DELETE operations, and can
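
The entry above mentions handling INSERT/UPDATE/DELETE operations. A common way to represent such row changes for a batch-oriented target like Hadoop is to append each change as a line in a staging CSV file rather than updating anything in place. The sketch below only illustrates that general idea; it is not Tungsten's code, and the file layout and field order are hypothetical.

```python
# Sketch: append MySQL-style row changes to a CSV staging file that a Hive
# external table could later read. Illustrative only; not Tungsten's actual format.
import csv
import os
import time

def stage_changes(changes, path="staging/db1.sales.csv"):
    """changes: iterable of (op, primary_key, row_dict); op is 'I', 'U' or 'D'."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for op, pk, row in changes:
            # opcode, commit timestamp, key, then the row values (empty for deletes)
            writer.writerow([op, int(time.time()), pk] + list(row.values()))

stage_changes([
    ("I", 101, {"id": 101, "amount": 19.99}),
    ("U", 101, {"id": 101, "amount": 24.99}),
    ("D", 101, {}),
])
```
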
MongoDB and Hadoop - Stockholm MongoDB User Group Meetup - Monday, March 3, 2014
February 27, 2014 By Severalnines

 

Stockholm MongoDB User Group Meetup: “MongoDB and Hadoop”

Monday, March 3, 2014 starting @ 5:00 PM

 

Join us next Monday as we host the Stockholm MongoDB User Group Meetup in Kista, also referred to as the Wireless Valley.

 

Our very own Vinay Joosery will be speaking about how best to automate the management & deployment of database clusters, specifically MongoDB clusters, though the same principles apply to MySQL, MariaDB and Percona XtraDB-based clusters. Henrik Ingo of MongoDB will be talking about Analytics with MongoDB & Hadoop. And Jim Dowling, a Senior Researcher at the Swedish Institute of Computer Science, will talk

  [Read more...]
No Hadoop Fun for Me at SCaLE 12X :(
I blogged a couple of weeks ago about my upcoming MySQL/Hadoop talk at SCaLE 12X. Unfortunately I had to cancel. A few days after writing the article I came down with an eye problem that has been fixed but still prevents me from flying anywhere for a few weeks. That's a pity, as I was definitely looking forward to attending the conference and explaining how Tungsten replicates transactions from MySQL into HDFS.

Meanwhile, we are still moving at full steam with Hadoop-related work at Continuent, which is the basis for the next major replication release, Tungsten Replicator 3.0.0. Binary builds and documentation will go up in a few days. There will also be many more public talks about Hadoop support, starting in

  [Read more...]
Why Aren't All Data Immutable?
Over the last few years there has been an increasing interest in immutable data management. This is a big change from the traditional update-in-place approach many database systems use today, where new values delete old values, which are then lost. With immutable data you record everything, generally using methods that append data from successive transactions rather than replacing them.  In some DBMS types you can access the older values, while in others the system transparently uses the old values to solve useful problems like implementing eventual consistency.
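
To make the append-versus-overwrite contrast concrete, here is a toy sketch (mine, not from the post) of a key-value store that never updates in place: every write appends a new version, so older values remain readable.

```python
# Toy append-only store: writes never overwrite; reads can see any prior version.
from collections import defaultdict

class AppendOnlyStore:
    def __init__(self):
        self._versions = defaultdict(list)  # key -> list of (txn_id, value)
        self._txn = 0

    def put(self, key, value):
        self._txn += 1
        self._versions[key].append((self._txn, value))  # append, never replace
        return self._txn

    def get(self, key, as_of=None):
        """Return the latest value, or the value as of a given transaction id."""
        for txn, value in reversed(self._versions[key]):
            if as_of is None or txn <= as_of:
                return value
        return None

store = AppendOnlyStore()
t1 = store.put("balance", 100)
store.put("balance", 80)               # an update-in-place store would lose the 100
print(store.get("balance"))            # 80
print(store.get("balance", as_of=t1))  # 100 -- the history is still there
```
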

Baron Schwartz recently pointed out that it can be hard to get decent transaction processing performance based on append-only methods like append-only B-trees.  This is not a very strong argument

  [Read more...]
Getting Data into Hadoop in real-time

Moving data between databases is hard. Without ever intending it, I seem to have spent a lifetime working on solutions for getting data into and out of databases, but more frequently moving it between them. In fact, my first job out of university was migrating data from BRS/Text, a free-text database (probably what we would now call a NoSQL database), into a more structured Oracle database.

Today I spend some of my time working in Big Data, more often than not migrating information from existing data stores into Big Data platforms so that it can be analysed, something I covered in more detail here:

http://www.ibm.com/developerworks/library/bd-sqltohadoop1/index.html
http://www.ibm.com/developerworks/library/bd-sqltohadoop2/index.html


  [Read more...]
Fun with MySQL and Hadoop at SCaLE 12X
It's my pleasure to be presenting at SCaLE 12X on the subject of real-time data loading from MySQL to Hadoop.  This is the first public talk on work at Continuent that enables Tungsten Replicator to move transactions from MySQL to HDFS (Hadoop Distributed File System).  I will explain how replication to Hadoop works, how to set it up, and offer a few words on constructing views of MySQL data using tools like Hive.

As usual, everything we are doing on Hadoop replication is open source.  Builds and documentation will be

  [Read more...]
Replicate from Oracle to Oracle, Oracle to MySQL, and Oracle to Analytics
Oracle is the most powerful DBMS in the world. However, Oracle's expensive and complex replication makes it difficult to build highly available applications or move data in real-time to data warehouses and popular databases like MySQL. In this webinar-on-demand you will learn how Continuent Tungsten solves problems with Oracle replication at a fraction of the cost of other solutions and with less
SQL to Hadoop and back again, Part 3: Direct transfer and live data exchange

The third and final article in my series on migrating data to and from Hadoop and SQL databases is now available:

Big data is a term that has been used regularly now for almost a decade, and it, along with technologies like NoSQL, is seen as the replacement for the long-successful RDBMS solutions that use SQL. Today, DB2®, Oracle, Microsoft® SQL Server, MySQL, and PostgreSQL dominate the SQL space and still make up a considerable proportion of the overall market. In this final article of the series, we will look at more automated solutions for migrating data to and from Hadoop. In the previous articles, we concentrated on methods that take exports or otherwise formatted and extracted data from your SQL source, load that into Hadoop in some way, then process or parse it. But if you want to analyze big data,

  [Read more...]
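
The excerpt above refers back to the earlier articles' approach of exporting data from the SQL source and then loading it into Hadoop. As a rough illustration of that pattern (my own sketch, not code from the article), the snippet below writes query results to a CSV file and copies it into HDFS with the standard `hdfs dfs -put` command; the table, paths and credentials are placeholders, and it assumes mysql-connector-python plus a local Hadoop client.

```python
# Sketch: export a MySQL table to CSV, then copy the file into HDFS for processing.
import csv
import subprocess
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="export", password="secret",
                               database="sales")
cur = conn.cursor()
cur.execute("SELECT id, customer_id, amount, created_at FROM orders")

with open("orders.csv", "w", newline="") as f:
    csv.writer(f).writerows(cur)  # stream result rows straight into the CSV file

cur.close()
conn.close()

# Put the export where Hive or MapReduce jobs can pick it up.
subprocess.run(["hdfs", "dfs", "-mkdir", "-p", "/staging/sales"], check=True)
subprocess.run(["hdfs", "dfs", "-put", "-f", "orders.csv", "/staging/sales/"], check=True)
```
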
Copying MySQL Data to Hadoop with Minimal Loss of Blood Part 2

I have spent the better part of the last month at Big Data conferences trying to see behind the $2.5 million in marketing smoke and work out what is really going to show up on the to-do list of DBAs. The first bit of news is that half the vendors at shows like Strata or Big Data TechCon will probably be gone by this time next year, so picking a vendor right now is a little iffy. Hadoop’s ecosystem is flourishing and will surely be around for some time, but the vendors are playing musical chairs.

But we are Open Source and we do not need vendors! Well, yes and no. The good folks at Cloudera and Hortonworks have done you a big favor by providing wonderful tutorials that are worth your time. Recently two former MySQL-ers, Sarah Sproehnle and Ian Wrigley, have put together

  [Read more...]
Big Data Tools that You Need to Know About – Hadoop & NoSQL – Part 2

 

In the previous article we introduced Hadoop as the most popular Big Data toolset on the market today. We had just started talking about MapReduce as the major framework that makes Hadoop distinctive. So let’s continue the discussion where we left off.

 

MapReduce is really the key to understanding Hadoop’s parallel processing functionality, as it enables data in various formats (XML, text, binary, log, SQL, etc.) to be divided up and mapped out to many compute nodes and then recombined to produce a final data set.
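
To make the divide / map / recombine description above concrete, here is a minimal single-process Python sketch of the MapReduce pattern (a toy illustration, not Hadoop code): input records are mapped to key/value pairs, grouped by key, and then reduced into a final data set.

```python
# Toy MapReduce: map -> shuffle (group by key) -> reduce, all in one process.
from collections import defaultdict

def map_phase(record):
    # Emit a (word, 1) pair for every word in the input line.
    for word in record.split():
        yield word.lower(), 1

def reduce_phase(key, values):
    # Recombine all values for a key into a single result.
    return key, sum(values)

def mapreduce(records):
    groups = defaultdict(list)
    for record in records:              # in Hadoop this loop is spread over many nodes
        for key, value in map_phase(record):
            groups[key].append(value)   # the "shuffle": group intermediate pairs by key
    return dict(reduce_phase(k, v) for k, v in groups.items())

lines = ["MySQL feeds Hadoop", "Hadoop recombines the mapped data"]
print(mapreduce(lines))  # {'mysql': 1, 'feeds': 1, 'hadoop': 2, ...}
```
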

 

  [Read more...]
New MySQL features, related technologies at Percona Live London

The upcoming Percona Live London conference, November 11-12, features quite a number of talks about the latest MySQL features and related technologies. There will be lots of talks about the new MySQL 5.6 features:

  [Read more...]
SQL to Hadoop and back again, Part 1: Basic data interchange techniques

I’ve got a new article, part of a new three-part series, on moving data between SQL and Hadoop, covering both the export to Hadoop and the import of processed content back into an SQL store.

In this first one, we look at the basic mechanics and considerations before you start the migration of data, such as the data format, content, and export techniques.

Read: SQL to Hadoop and back again, Part 1: Basic data interchange techniques


Data Analytics at NBCUniversal. Interview with Matthew Eric Bassett.
“The most valuable thing I’ve learned in this role is that judicious use of a little bit of knowledge can go a long way. I’ve seen colleagues and other companies get caught up in the “Big Data” craze by spending hundreds of thousands of pounds sterling on a Hadoop cluster that sees a few megabytes [...]
Percona Live London 2013: an insider’s view of the schedule

With the close of the call for papers earlier this month, the Percona Live London conference committee was in full swing this past week, reviewing all of the many submissions for November’s Percona Live London MySQL Conference.

The submissions are far ranging and cover some really interesting topics, making the lineup for Percona Live London really strong! What the committee looks for in a submission is how much “value” a talk will bring to the

  [Read more...]
MySQL webinar: ‘Introduction to open source column stores’

Join me Wednesday, September 18 at 10 a.m. PDT for an hour-long webinar where I will introduce the basic concepts behind column store technology. The webinar’s title is: “Introduction to open source column stores.”

What will be discussed?

This webinar will talk about Infobright, LucidDB, MonetDB, Hadoop (Impala) and other column stores

  • I will compare features between major column stores (both open and closed source).
  • Some benchmarks will be used to demonstrate the basic
  [Read more...]
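
As a quick taste of the basic idea behind the column stores listed above (my own toy sketch, not material from the webinar): storing a table column by column means an analytic query that touches one column only has to read that one column, instead of walking every row.

```python
# Toy contrast between row-oriented and column-oriented storage of the same table.
rows = [
    {"id": 1, "region": "EU", "amount": 10.0},
    {"id": 2, "region": "US", "amount": 25.0},
    {"id": 3, "region": "EU", "amount": 7.5},
]

# Row store: each record is kept together; summing one column still walks whole rows.
total_row_store = sum(r["amount"] for r in rows)

# Column store: each column is kept together; the sum reads only the 'amount' column.
columns = {name: [r[name] for r in rows] for name in rows[0]}
total_column_store = sum(columns["amount"])

assert total_row_store == total_column_store == 42.5
print(columns["amount"])  # only this list needs scanning (and it compresses well)
```
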
Copying MySQL Data to Hadoop with Minimal Loss of Blood Part 1

Ask ten DBAs for a definition of ‘Big Data’ and you will get more than ten replies. And the majority of those replies will lead you to Hadoop. Hadoop has been the most prominent of the big data frameworks in the open source world. Over 80% of the Hadoop instances in the world are fed their data from MySQL1. But Hadoop is made up of many parts, some confusing and many that do not play nicely with each other. It is analogous to being given a pile of automotive parts from different models and trying to come up with a car at the end of the day. So what do you do if you want to copy some of your relational data into Hadoop and want to avoid the equivalent of scraped knuckles? The answer is Bigtop, and what follows is a way to get a one-node-does-all system running so you can experiment with Hadoop, Map/Reduce, Hive, and all

  [Read more...]
Big Data with MySQL and Hadoop at MySQL Connect 2013

I will be talking about Big Data with MySQL and Hadoop at MySQL Connect 2013 (Sept. 21-22) in San Francisco, as well as at Percona University in Washington, DC (September 12, 2013). Apache Hadoop is a very popular Big Data solution, and nowadays we can easily integrate it with MySQL. I will start with a brief introduction to Apache Hadoop and its components (HDFS, Map/Reduce, Hive, HBase/HCatalog, Flume, Sqoop, etc.). Next I will show two major Big Data scenarios:

  • From file to Hadoop to MySQL. This is an example of an “ELT” process: Extract data from an external source; Load data into Hadoop; Transform
  [Read more...]
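
The ELT bullet above is cut off at the transform step. For the final "to MySQL" leg of that scenario, a typical pattern is to read the tab-separated output a Hadoop job produces and bulk-insert it into a table; the following is only a sketch under my own assumptions (mysql-connector-python, a hypothetical part-00000 result file and a hypothetical word_counts table), not the presenter's example.

```python
# Sketch: load a Hadoop job's tab-separated output file into a MySQL table.
import csv
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="etl", password="secret",
                               database="analytics")
cur = conn.cursor()

with open("part-00000", newline="") as f:
    reader = csv.reader(f, delimiter="\t")  # Hadoop/Hive output is commonly TSV
    batch = [(word, int(count)) for word, count in reader]

cur.executemany("INSERT INTO word_counts (word, total) VALUES (%s, %s)", batch)
conn.commit()
cur.close()
conn.close()
```
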
MySQL and Hadoop integration

Dolphin and Elephant: an Introduction

This post is intended for MySQL DBAs or sysadmins who need to start using Apache Hadoop and want to integrate those two solutions. In this post I will cover some basic information about Hadoop, focusing on Hive, as well as MySQL and Hadoop/Hive integration.

First of all, if you have been dealing with MySQL or any other relational database for most of your professional life (like I have), Hadoop may look different. Very different. In many ways, Hadoop is the opposite of a relational database. Unlike a database, where we have a set of tables and indexes, Hadoop works with a set of text files. And… there are no indexes at all. And yes, this may be shocking,

  [Read more...]
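
To illustrate the "set of text files, no indexes" point from the excerpt above (my own toy comparison, not from the post): where MySQL would use an index to jump straight to matching rows, a Hadoop-style query has to scan every file in the dataset's directory.

```python
# Toy illustration: answering a query over a directory of text files means a full scan.
import csv
import glob

def total_for_region(data_dir, region):
    total = 0.0
    # No index to consult: read every file and every line, keep what matches.
    for path in glob.glob(f"{data_dir}/*.csv"):
        with open(path, newline="") as f:
            for row_region, amount in csv.reader(f):
                if row_region == region:
                    total += float(amount)
    return total

# Hypothetical layout: each file holds "region,amount" lines.
# print(total_for_region("/data/sales", "EU"))
```
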
On Oracle NoSQL Database –Interview with Dave Segleau.
“We went down the path of building Oracle NoSQL database because of explicit request from some of our largest Oracle Berkeley DB installations that wanted to move away from maintaining home grown sharding implementations and very much wanted an out of box technology that can replicate the robustness of what they had built “out of [...]
What technologies are you running alongside MySQL?

In many environments MySQL is not the only technology used to store in-process data.

Quite frequently, especially with large-scale or complicated applications, we use MySQL alongside other technologies for tasks such as reporting and caching, as well as serving as the main data store for portions of the application.

What technologies for data storage and processing do you use alongside MySQL in your environment? Please feel free to elaborate in the comments about your use case and experiences!

Note: There is a poll embedded within this post; please visit the site to participate.

The post

  [Read more...]