The “Big Data” buzzword finally gets a real definition

We’ve all heard the term “Big Data” thrown around a fair amount in the last several years, ever since the rise of Hadoop and other distributed storage methods. But “Big Data” has always been a subjective term that hinges on perspective; what one engineer considers big can be vastly different from what another does.

However, there’s finally a definite description that says Big Data no matter what perspective you operate from: “That facility by my calculations that I submitted to the court for the Electronic Frontiers Foundation against NSA would hold on the order of 5 zettabytes of data. Just that current storage capacity is being advertised on the web that you can buy. And that’s not talking about what they have in the near future.” You can read more about the facility and its purpose here: …

On Big Data, Analytics and Hadoop. Interview with Daniel Abadi.

“Some people even think that “Hadoop” and “Big Data” are synonymous (though this is an over-characterization). Unfortunately, Hadoop was designed based on a paper by Google in 2004 which was focused on use cases involving unstructured data (e.g. extracting words and phrases from Webpages in order to create Google’s Web index). Since it was not [...]

Small Data

There is obviously much being written these days about Big Data. While the term has many different meanings to many different folks, our MySQL and MariaDB customers tend to find their data to be uncomfortably big when the tables become too large for memory. In this case, more storage has to be acquired, performance starts to lag, and making changes to the schema becomes a challenge.

TokuDB addresses these issues for big MySQL instances by delivering high compression rates, faster insertion and query performance, and agile …

Two Cons against NoSQL. Part I.

Two cons against NoSQL data stores read like this: 1. It’s very hard to move data out from one NoSQL to some other system, even other NoSQL. There is a very hard lock-in when it comes to NoSQL. If you ever have to move to another database, you basically have to re-implement a lot [...]

Webinar: MongoDB and Fractal Tree Indexes

This webinar covers the basics of B-trees and Fractal Tree Indexes, the benchmarks we’ve run so far, and the development road map going forward.

Date: November 13th
Time: 2 PM EST / 11 AM PST

Topics will include:

  • What is a Fractal Tree Index?
  • How do Fractal Trees compare with B-Trees?
  • What can a Fractal Tree do for MongoDB performance?
  • Benchmarks + Gotchas
  • What’s next

We look forward to having you join the webinar. We also hope that by sharing these results with the community we will be able to elicit people’s …

On Eventual Consistency– Interview with Monty Widenius.

“For analytical things, eventual consistency is ok (as long as you can know after you have run them if they were consistent or not). For real world involving money or resources it’s not necessarily the case.” — Michael “Monty” Widenius. In a recent interview, I asked Justin Sheehy, Chief Technology Officer at Basho Technologies, maker [...]

Presenting “MongoDB and Fractal Tree Indexes” at MongoDB Boston 2012

I’ll be presenting “MongoDB and Fractal Tree Indexes” at MongoDB Boston 2012 on October 24th.  My presentation covers the basics of B-trees and Fractal Tree Indexes, the benchmarks we’ve run so far, and the development road map going forward.

I’ve been to this one-day conference twice now, and both times I came away with a better understanding of MongoDB’s capabilities and use cases, and with many questions answered via their deep technical dives. I highly recommend that current MongoDB users and anyone considering a MongoDB project attend – it appears that seats are still available.

First of New NSF Big Data Grants Go to Tokutek Founders

The core technology behind Tokutek is based on the academic research by our founders: Michael Bender, Bradley Kuszmaul and Martin Farach-Colton.  They are all still in academia, in addition to their work at Tokutek.

Back in March, the White House kicked off a new Initiative for Big Data.  Last week, the National Science Foundation announced the first interagency grants for this.  Eight awards were given, and our own Michael Bender and Martin Farach-Colton, along with Robert Johnson of Stony Brook University, received one of them.

Through their academic work, they hope to extend our …

Scaling MySQL and MariaDB to TBs: Interview with Martín Farach-Colton.

“While I believe that one size fits most, claims that RDBMS can no longer keep up with modern workloads come in from all directions. When people talk about performance of databases on large systems, the root cause of their concerns is often the performance of the underlying B-tree index”– Martín Farach-Colton. Scaling MySQL and MariaDB [...]

Log Buffer #289, A Carnival of the Vanities for DBAs

Oracle Open World 2012, this year, was all about Cloud, 12c, Exadata, Fusion, SuperClusters, social media, content management and much more. From operating systems to databases, and from applications to interactive media, professionals all around the world presented, attended, and networked in San Francisco. MySQL’s professionals also rocked massively. SQL Server bloggers also remained actively [...]

