Showing entries 1 to 10 of 292
10 Older Entries »
Displaying posts with tag: TokuView (reset)
Why Percona Acquired Tokutek: by Peter Zaitsev

It is my pleasure to announce that Percona has acquired Tokutek and will take over development and support for TokuDB® and TokuMX™ as well as the revolutionary Fractal Tree® indexing technology that enables those products to deliver improved performance, reliability and compression for modern Big Data applications.

At Percona we have been working with the Tokutek team since 2009, helping to improve performance and scalability. The TokuDB storage engine has been available for Percona Server for about a year, so joining forces is quite a natural step for us.

Fractal Tree indexing technology, developed through years of data science research at MIT, Stony Brook University and Rutgers University, is a new-generation data structure that, for many workloads, leapfrogs traditional B-tree technology, which was invented in 1972 (over 40 years ago!). It is also often …

[Read more]
Increasing Cloud Database Efficiency – Like Crows in a Closet

In Mo’ Data, Mo’ Problems, we explored the paradox that “Big Data” projects pose to organizations and how Tokutek is taking an innovative approach to solving those problems. In this post, we’re going to talk about another hot topic in IT, “The Cloud,” and how enterprises undertaking Cloud efforts often struggle with the idea of “problem trading.” Also, for some reason, databases are simply given a pass as traditional “noisy neighbors,” on the assumption that nothing can be done about it. Let’s take a look at why we disagree.

With the birth of the information age came a coupling of business and IT. Increasingly strategic business projects and objectives were reliant on information infrastructure to provide information storage and retrieval instead of paper and filing cabinets. This was the dawn of the database and what gave rise to companies like Oracle, Sybase and MySQL. With the appearance of true Enterprise Grade …

[Read more]
TokuDB Table Optimization Improvements

Section I: Fractal Tree and Optimization Overview
Tokutek’s Fractal Tree® technology provides fast performance by injecting small messages into buffers inside the Fractal Tree index. This allows writes to be batched, thus eliminating the I/O that traditional B-tree indexes require for every operation. Additional background information on how Fractal Trees operate can be found in Zardosht Kasheff’s blog post entitled TokuMX Fractal Tree Indexes, What Are They? Don’t be thrown off by the title: Fractal Tree indexes access data in the same way for TokuDB as they do for TokuMX.
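
The buffering idea described above can be illustrated with a toy model. This is only a minimal sketch under stated assumptions, not Tokutek's implementation: the class name `BufferedNode`, the `capacity` parameter, and the dictionary standing in for the on-disk levels of the tree are all hypothetical. Each call to `flush` stands in for one batched I/O.

```python
class BufferedNode:
    """Toy node: queue write messages and apply them to the store in batches."""

    def __init__(self, capacity=4):
        self.capacity = capacity  # hypothetical buffer size, in messages
        self.buffer = []          # pending (op, key, value) messages
        self.store = {}           # stands in for the lower, on-disk levels
        self.flushes = 0          # number of batched "I/O" operations so far

    def put(self, key, value):
        self._enqueue(("put", key, value))

    def delete(self, key):
        self._enqueue(("delete", key, None))

    def _enqueue(self, message):
        self.buffer.append(message)
        if len(self.buffer) >= self.capacity:
            self.flush()

    def flush(self):
        """Apply all pending messages in arrival order as one batch."""
        if not self.buffer:
            return
        for op, key, value in self.buffer:
            if op == "put":
                self.store[key] = value
            else:
                self.store.pop(key, None)
        self.buffer.clear()
        self.flushes += 1

    def get(self, key):
        """Read through the buffer: the newest pending message wins."""
        for op, pending_key, value in reversed(self.buffer):
            if pending_key == key:
                return value if op == "put" else None
        return self.store.get(key)
```

In this sketch, eight `put` calls with a capacity of four cost two flushes rather than eight separate writes. A `delete` that has not yet been flushed hides the key from `get` while the old value still occupies the store.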

For tables whose workload pattern is a high number of sequential deletes, some operational maintenance is required to ensure consistently fast performance.  If this is not done, delete messages and garbage can exist in the Fractal …

[Read more]
TokuDB Hot Backup Now a MySQL Plugin

In the recently released TokuDB 7.5.5, the implementation of TokuDB hot backup moved from a patch to the MySQL server to a MySQL plugin. Why did we make this change?

TokuDB hot backup makes a transactionally consistent copy of the TokuDB files while applications continue to read and write these files.  Christian Rober wrote a nice series of blogs about how hot backup works.  See TokuDB hot backup 1 and TokuDB hot backup 2 for details.  In summary, the TokuDB hot backup library intercepts system calls that write files and duplicates the writes on backup files. It does this while copying files to the backup directory.
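
The write-duplication idea summarized above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the TokuDB hot backup library: the class `MirroredFile` and its methods are hypothetical, the wrapper replaces real system-call interception, and the file copy completes before mirroring starts rather than running concurrently with the writes.

```python
import os
import shutil
import tempfile


class MirroredFile:
    """Toy stand-in for write interception: each write goes to the source
    file and, while a backup is active, to the backup copy as well."""

    def __init__(self, path):
        self.path = path
        self.backup_path = None
        open(path, "ab").close()  # make sure the source file exists

    def start_backup(self, backup_path):
        # Simplification: copy first, then mirror. The real library copies
        # while applications keep writing.
        shutil.copyfile(self.path, backup_path)
        self.backup_path = backup_path

    def stop_backup(self):
        self.backup_path = None

    def pwrite(self, data, offset):
        self._write_at(self.path, data, offset)
        if self.backup_path is not None:
            self._write_at(self.backup_path, data, offset)

    @staticmethod
    def _write_at(path, data, offset):
        with open(path, "r+b") as handle:
            handle.seek(offset)
            handle.write(data)


def demo():
    """Write before, during, and after a backup; return both files' bytes."""
    with tempfile.TemporaryDirectory() as directory:
        source = os.path.join(directory, "data.bin")
        backup = os.path.join(directory, "data.bak")
        mirrored = MirroredFile(source)
        mirrored.pwrite(b"hello", 0)  # before the backup: source only
        mirrored.start_backup(backup)
        mirrored.pwrite(b"J", 0)      # during the backup: both files
        mirrored.stop_backup()
        mirrored.pwrite(b"X", 1)      # after the backup: source only
        with open(source, "rb") as s, open(backup, "rb") as b:
            return s.read(), b.read()
```

Calling `demo()` returns the final contents of the source and backup files. The backup reflects every write made up to the moment the backup stopped, and none made afterwards.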

There are two changes made to MySQL to get TokuDB hot backup working.

First, the hot backup …

[Read more]
Mo’ Data, Mo’ Problems

Welcome to blog #2 in a series about the benefits of the Fractal Tree. In this post, I’ll be explaining Big Data, why it poses such a problem and how Tokutek can help. Given the fact that I am a lifelong fan of both Hip-hop and Big Data, the title was a no-brainer and, given the artist, a bit of a pun.

I am as tired as you of hearing the term “Big Data.” It’s so overused that it has ceased to have any specific meaning. You see, data hardly ever starts as “big” or a “problem.” Rather, it starts small and easily manageable, but gradually grows to some unimaginable size and becomes a beast in need of slaying, like the irradiated ant from a sci-fi film, growing to the size of a cruise ship. The nature of tackling such a tough problem means that the initial understanding of the factors involved is, oftentimes, incomplete at best; Catch-22 exemplified. During the course of problem it is …

[Read more]
Fractal Tree Greatness: The Nexus

In my recent travels, I’ve been speaking with database users at various meetups and trade shows worldwide. Very often, I got questions centering around the best use cases for our products, be it TokuDB, our MySQL storage engine, or TokuMX, our distribution of MongoDB. Over 90% of the time, I responded Cloud, Big Data or both. You see, in the software industry we’re like kindergartners: we like things to fit into neat categories. If you know any software sales people, you’ll recognize this as a fitting analogy (at least in terms of energy and attention span), but I digress. This strategy helps allocate resources where they are most likely to make an impact, and, thus, optimize our return on investment. In this blog series, I’m going to go slightly against the grain and explain why the Fractal Tree makes databases work in these environments.

Before I get into more detail, let me …

[Read more]
Testing TokuDB’s Group Commit Algorithm Improvement

The MySQL 5.6 release introduced some changes to how two-phase commit works and is managed. In particular, the commit phase of transactions to the binary log is now serialized, a behavior we identified almost immediately. We implement a group commit algorithm that needed to be altered so that TokuDB’s group commit to its recovery log would function effectively.

As part of our effort to verify the new Binary Log Group Commit functionality introduced in TokuDB 7.5.4 for Percona Server, we wanted to demonstrate the substantial increase in throughput scaling but also show the bottleneck caused by the skewed interaction between the binary log group commit algorithm in MySQL 5.6 and the transaction commit mechanism used in TokuDB 7.5.3 for Percona Server.  During our testing, we noticed that the throughput scaling was diminished when we turned on the binlog.

Here are the relevant system …

[Read more]
Scaling TokuDB Performance with Binlog Group Commit

TokuDB offers high throughput for write-intensive applications, and the throughput scales with the number of concurrent clients. However, when the binary log is turned on, TokuDB 7.5.2 throughput suffers. The throughput scaling problem is caused by a poor interaction between the binary log group commit algorithm in MySQL 5.6 and the way TokuDB commits transactions. TokuDB 7.5.4 for Percona Server 5.6 fixes this problem, and the result is roughly an order of magnitude increase in SysBench throughput for in-memory workloads.

MySQL uses a two-phase commit protocol to synchronize the MySQL binary log with the recovery logs of the storage engines when a transaction commits. Since fsyncs are used to ensure the durability of the data in the various logs, and fsyncs can be very slow, the fsync can easily become a bottleneck. A …

[Read more]
Benchmarking Presentation at Percona Live London 2014

In a few weeks I’m presenting “Performance Benchmarking: Tips, Tricks, and Lessons Learned” at Percona Live London 2014 (November 3-4). I continue to learn lessons and improve my benchmarking capabilities, so the content is a full upgrade from my presentation at Percona Live Santa Clara in April 2013. Anyone interested in achieving and sustaining the best performance out of their software/hardware/application should attend.

Also, Tokutek is sponsoring the event, so we’ll be available in the expo hall throughout the show.

If you are attending or in the area and want to learn more about …

[Read more]
TokuDB Read Free Replication : Details and Use Cases

The biggest innovation in TokuDB v7.5 is Read Free Replication (RFR). A few days ago I posted a benchmark showing how much additional throughput can be achieved on a replication slave while lowering read IO operations to almost zero. The official documentation on the feature is available here.

In this second blog I want to cover the requirements for RFR, as well as some interesting use-cases for the technology.

RFR Requirements

The only requirement on the master is that …

[Read more]