Last week I wrote up my thoughts about the Percona acquisition of
Tokutek from the perspective of TokuDB and TokuMX[se]. In this third blog of the trilogy
I'll cover the acquisition and the future of the Fractal Tree
Index. The Fractal Tree Index is the foundational technology upon
which all Tokutek products are built.
So what is a Fractal Tree Index? To quote the Wikipedia page:
"a Fractal Tree index is a tree data structure that keeps data
sorted and allows searches and …
An idea for a benchmark based on the “arrival request” rate, which I wrote about in a post headlined “Introducing new type of benchmark” back in 2012, was implemented in Sysbench. However, Sysbench provides only a simple workload, so to compare InnoDB with TokuDB, and later MongoDB with Percona TokuMX, I wanted to use more complicated scenarios. (Both TokuDB and TokuMX are part of Percona’s product line; in case you missed it, Tokutek is now part of the Percona family.)
Thanks to Facebook – they provide LinkBench, a benchmark that emulates the social graph …
[Read more]
A few days ago I wrote up my thoughts about the Percona acquisition of
Tokutek with respect to TokuDB. In this blog I'm going to do
the same for TokuMX and TokuMXse. And in a few days I'll wrap up
this trilogy by sharing my thoughts about Fractal Tree
Indexes.
Again, when I'm writing up something that I was very involved
with in the past I think it's important to disclose that I worked
at Tokutek for 3.5 years (08/2011 - 01/2015) as VP/Engineering
and I do not have any equity in Tokutek or Percona.
Since much of the MySQL crowd might be hearing about Tokutek's
"other products" for the first time I'll provide a little history
of both of the products before I dive in deeper.
TokuMX is a fork of MongoDB …
It is my pleasure to announce that Percona has acquired Tokutek and will take over development and support for TokuDB® and TokuMX™ as well as the revolutionary Fractal Tree® indexing technology that enables those products to deliver improved performance, reliability and compression for modern Big Data applications.
At Percona we have been working with the Tokutek team since 2009, helping to improve performance and scalability. The TokuDB storage engine has been available for Percona Server for about a year, so joining forces is quite a natural step for us.
Fractal Tree indexing technology, developed through years of data science research at MIT, Stony Brook University and Rutgers University, is the new generation data structure which, for many workloads, leapfrogs traditional B-tree technology, which was invented in 1972 (over 40 years ago!). It is also often superior to LSM indexing, especially for mixed workloads.
…
[Read more]
In Mo’ Data, Mo’ Problems, we explored the paradox that “Big Data” projects pose to organizations and how Tokutek is taking an innovative approach to solving those problems. In this post, we’re going to talk about another hot topic in IT, “The Cloud,” and how enterprises undertaking Cloud efforts often struggle with the idea of “problem trading.” Also, for some reason, databases are given a pass as traditional “noisy neighbors,” as if nothing can be done about it. Let’s take a look at why we disagree.
With the birth of the information age came a coupling of business and IT. Increasingly strategic business projects and objectives were reliant on information infrastructure to provide information storage and retrieval instead of paper and filing cabinets. This was the dawn of the database and what gave rise to companies like Oracle, Sybase and MySQL. With the appearance of true Enterprise Grade …
[Read more]
There are generally three components to any benchmark
project:
- Create the benchmark application
- Execute it
- Publish your results
I assume many people think they want to run more benchmarks but
give up since step 2 is extremely time consuming as you expand the
number of different configurations/scenarios.
I'm hoping that this blog post will encourage more people to
dive in and participate, as I'll be sharing the bash script I
used to test the various compression options coming in the MongoDB
3.0 storage engines. It enabled me to run a few different
tests against 8 different configurations, recording insertion
speed and size-on-disk for each one.
If you're into this sort of thing, please read on and provide any
feedback or improvements you can think of. …
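As a rough illustration of the loop such a script automates, here is a minimal Python sketch. It is not the bash script from the post: `run_matrix`, `demo_workload`, and the two configuration names are hypothetical stand-ins, and the workload just writes a file instead of inserting into a database. It runs one workload per configuration and records elapsed time and size on disk for each.

```python
import os
import tempfile
import time
import zlib


def dir_size_bytes(path):
    """Total size of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def run_matrix(configs, workload):
    """Run workload(config, data_dir) once per config; record time and size."""
    results = []
    for config in configs:
        # A fresh directory per configuration keeps the runs independent.
        with tempfile.TemporaryDirectory() as data_dir:
            start = time.perf_counter()
            rows = workload(config, data_dir)
            elapsed = time.perf_counter() - start
            results.append({
                "config": config,
                "rows": rows,
                "seconds": elapsed,
                "bytes_on_disk": dir_size_bytes(data_dir),
            })
    return results


def demo_workload(config, data_dir):
    """Hypothetical stand-in: writes 1000 rows, zlib-compressed if asked."""
    payload = b"".join(
        b"row-%06d,some repetitive value\n" % i for i in range(1000)
    )
    if config == "zlib":
        payload = zlib.compress(payload)
    with open(os.path.join(data_dir, "data.bin"), "wb") as f:
        f.write(payload)
    return 1000


results = run_matrix(["none", "zlib"], demo_workload)
for r in results:
    print(r["config"], r["rows"], r["bytes_on_disk"])
```

Swapping `demo_workload` for a function that starts a server with the given settings and runs real inserts would turn this into an actual benchmark driver; that part is left out because it depends on the setup being tested.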
Today is my last day at Tokutek. On Monday I'm starting a new opportunity
as VP/Technology at CrunchTime!. If you are a web developer, database
developer, or quality assurance engineer in the Boston area and
looking for a new opportunity please contact me or visit the
CrunchTime! career page.
I've really enjoyed my time at VoltDB and Tokutek. Working for
Mike Stonebraker (at VoltDB) was on my career
"bucket list" and in these past 3.5 years at Tokutek I've
experienced the awesomeness of the MySQL ecosystem and the
surging NoSQL database …
Welcome to blog #2 in a series about the benefits of the Fractal Tree. In this post, I’ll be explaining Big Data, why it poses such a problem and how Tokutek can help. Given the fact that I am a lifelong fan of both Hip-hop and Big Data, the title was a no-brainer and, given the artist, a bit of a pun.
I am as tired as you of hearing the term “Big Data.” It’s so overused that it no longer has any specific meaning. You see, data hardly ever starts as “big” or a “problem.” Rather, it starts small and easily manageable, but gradually grows to some unimaginable size and becomes a beast in need of slaying, like the irradiated ant from a sci-fi film, growing to the size of a cruise ship. The nature of tackling such a tough problem means that the initial understanding of the factors involved is, oftentimes, incomplete at best; Catch-22 exemplified. During the course of problem it is …
[Read more]
In my recent travels, I’ve been speaking with database users at various meetups and trade shows worldwide. Very often, I got questions centering around the best use cases for our products, be it TokuDB, our MySQL storage engine, or TokuMX, our distribution of MongoDB. Over 90% of the time, I responded Cloud, Big Data or both. You see, in the software industry we’re like kindergartners: we like things to fit into neat categories. If you know any software sales people, you’ll recognize this as a fitting analogy (at least in terms of energy and attention span), but I digress. This strategy helps allocate resources where they are most likely to make an impact, and, thus, optimize our return on investment. In this blog series, I’m going to go slightly against the grain and explain why the Fractal Tree makes databases work in these environments.
Before I get into more detail, let me …
[Read more]
Last month I wrote a blog about the closing of MongoDB ticket SERVER-1240, which brings Collection Level Locking (CLL) to the MMAPV1 storage engine in MongoDB 2.8. In MongoDB 2.6 there is a writer lock at the database level, so each database only allows one writer at a time. In concurrent write workloads, this means that all writers essentially form a single line and do their writes one at a time. In MongoDB 2.8 this lock has been moved to the collection level. Better yet is document level locking, but even though this feature was shown at MongoDB World 2014 it's not going to ship. But it did make for one amazing demo by …
[Read more]
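To make the database-level versus collection-level locking difference concrete, here is a back-of-the-envelope Python model. It is an idealized sketch, not MongoDB code: the `serialized_time` helper, the sample writes, and the 5 ms durations are all made up for illustration. It assumes writes that share a lock run strictly one at a time and writes under different locks run fully in parallel.

```python
from collections import defaultdict


def serialized_time(writes, lock_key):
    """Idealized finish time: writes sharing a lock key queue up one after
    another, while writes under different keys proceed in parallel."""
    busy = defaultdict(float)
    for w in writes:
        busy[lock_key(w)] += w["ms"]
    return max(busy.values(), default=0.0)


# Hypothetical workload: four 5 ms writes to one database, three collections.
writes = [
    {"db": "app", "coll": "users", "ms": 5.0},
    {"db": "app", "coll": "orders", "ms": 5.0},
    {"db": "app", "coll": "users", "ms": 5.0},
    {"db": "app", "coll": "events", "ms": 5.0},
]

# One writer lock per database: every write waits in a single line.
db_level = serialized_time(writes, lambda w: w["db"])

# One writer lock per collection: only writes to the same collection queue.
coll_level = serialized_time(writes, lambda w: (w["db"], w["coll"]))

print(db_level, coll_level)
```

In this toy model the database-level lock takes as long as all writes combined, while the collection-level lock is bounded by the busiest collection. Real servers have other bottlenecks, so the numbers only show the direction of the effect.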