As described in the first article of this series, Tungsten Replicator can replicate data from MySQL to Vertica in real-time. We use a new batch loading feature that applies transactions to data warehouses in very large blocks using COPY or LOAD DATA INFILE commands. This second and concluding article walks through the details of setting up and testing MySQL to Vertica replication.
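To make the batch-loading idea concrete, here is a minimal sketch of it in Python. It is not Tungsten Replicator's actual implementation. The helper names (`batches`, `rows_to_csv`, `copy_statement`), the table name, and the file path are hypothetical and chosen for illustration only. The sketch groups row changes into large blocks, serializes each block as CSV, and builds one Vertica `COPY` statement per block instead of one `INSERT` per row.

```python
import csv
import io


def batches(rows, size):
    """Split a list of rows into consecutive blocks of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def rows_to_csv(rows):
    """Serialize one block of row tuples into a single CSV payload."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


def copy_statement(table, csv_path):
    """Build one COPY statement that loads a whole CSV file in a single command.

    The table and path are assumed to come from trusted configuration,
    not from user input, because they are interpolated directly.
    """
    return f"COPY {table} FROM LOCAL '{csv_path}' DELIMITER ',' ENCLOSED BY '\"'"


if __name__ == "__main__":
    changes = [(1, "widget"), (2, "gadget, large"), (3, "gizmo")]
    for block in batches(changes, 2):
        print(rows_to_csv(block), end="")
    print(copy_statement("sales.orders", "/tmp/batch.csv"))
```

The point of the design is that the per-statement overhead is paid once per block rather than once per row, which is what makes bulk commands such as `COPY` or `LOAD DATA INFILE` attractive for data warehouse loading.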
To keep the article reasonably short, I assume that readers are conversant with MySQL, Tungsten, and Vertica. Basic replication setup is not hard if you follow all the steps described here, but of course there are variations in every setup. For more information on Tungsten, check out the Tungsten Replicator [Read more...]
Real-time analytics allow companies to react rapidly to changing business conditions. Online ad services process click-through data to maximize ad impressions. Retailers analyze sales patterns to identify micro-trends and move inventory to meet them. The common theme is speed: moving lots of information without delay from operational systems to fast data warehouses that can feed reports back to users as quickly as possible.
Real-time data publishing is a classic example of a big data replication problem. In this two-part article I will describe recent work on Tungsten Replicator to move data out of MySQL into Vertica at high speed with minimal load on DBMS [Read more...]
If you're in the Los Angeles area on Feb 15, come hear my talk at LAMySQL, inspired by lessons from real-life experiences. In addition to hearing a unique and interesting talk, you can win an AppleTV thanks to the awesome folks at @NoodleYard.
Data is the most valuable asset of an organization because it's irreplaceable.
Yet we hear about f**k ups related to data administration every day, at startups and organizations of all sizes. Sometimes it's no one's fault. Sometimes it's the fault of a drunk friend who shouldn't have been [wherever he was] in the first place.
Yet, at other times, the disaster could have been prevented. Sometimes, these f**k ups are [Read more...]
Googling around, I came across Bradford Cross' article, Big Data Is Less About Size, And More About Freedom. Bradford writes, "The scale of data and computations is an important issue, but the data age is less about the raw size of your data, and more about the cool stuff you can do with it."
Even though the article makes some good points, I'm not sure I can agree with Bradford's point of view here. As an architect, when I think in terms of Big Data, the ability to do "cool stuff" is probably the last thing that crosses my mind. Big Data, to me, is about ensuring constant response time as the data grows in size without sacrificing functionality.
What do you think Big Data is about? Is it merely about being able to do 'cool stuff' with your data? Is it about ensuring [Read more...]