Since we announced that TokuDB is now open source, there has been a lot of positive feedback (thanks!) and also some questions about the details. I want to take this opportunity to give a quick, high-level guide to our repositories on GitHub.
Here are the repositories:
Every few months, I get the fun job of announcing what’s new in TokuDB®, but this time is special. With Version 7, TokuDB for MySQL and MariaDB is going open source.
The free Community Edition is fully functional and fully performant. It has all the compression you’ve come to expect from TokuDB. It has hot schema changes: no-down-time column insertion, deletion, renaming, etc., as well as index creation. It has clustering secondary keys. We are also announcing an Enterprise Edition (coming soon) with additional benefits, such as a support package and advanced backup and recovery tools.
Making TokuDB open source is a natural next step for Tokutek’s involvement in the MySQL community. So far, Tokutek has been involved in the community in many ways:
If T.S. Eliot were a MySQL DBA, I think he would have been more upbeat about April.
We are gearing up for an incredible second half of April. We will be presenting three separate sessions at the Percona Live: MySQL Conference and Expo 2013, April 22-25, in Santa Clara, CA. In addition, we will be presenting at SkySQL’s MySQL & Cloud Database Solutions Day on Friday, April 26 at the same location.
Come by to see us in Booth #114, or stop by one of our sessions:
A little while ago I blogged about (and open sourced) an Impala-powered soccer visualization demo, designed to demonstrate just how responsive Impala queries can be. Since not everyone has the time or resources to run the project themselves, we’ve decided to host it ourselves on an EC2 instance. You can try the visualization; we’ve also opened up the Impala web interface, where you can see query profiles and performance numbers, and Hue (username and password are both ‘test’), where you can run your own queries on the dataset.
"While not comprehensive, the uses for NoSQL databases center around the acquisition of fast-growing data or data that does not easily fit within uniform structures."
During the second half of our CUBE discussion with Wikibon analyst Jeff Kelly at this year’s Strata Conference in Santa Clara, we talked about the tipping point for Big Data. Strata veterans could see at a glance that this year’s conference was markedly different. No longer the exclusive domain of geeks and database administrators, this year’s Strata featured some of the biggest enterprise vendors around. With heavyweight enterprise players Intel and EMC Greenplum announcing their own Hadoop distributions, big data is clearly going mainstream. Now that we know how to capture, store, access and analyze big data, what’s the next step? Listen in to hear my conversation with Jeff Kelly about taking big data[Read more...]
We had the opportunity to do a CUBE interview with Wikibon analyst Jeff Kelly at last week’s Strata Conference in Santa Clara. In the first part of our conversation, we discuss how our success in integrating Tokutek’s Fractal Tree® technology into MySQL has led us to another popular database, MongoDB. We explain the results of our recent benchmarking tests with MongoDB, which indicate that adding indexing can also improve performance for this popular NoSQL database with faster insertion rates, lower query latency and[Read more...]
With TokuDB v6.6 out now, I’m excited to present one of my favorite enhancements: fast updates with TokuDB. Update intensive applications can have their throughput limited by the random read capacity of the storage system. The cause of the throughput limit is the read-modify-write algorithm that MySQL uses when processing update statements. MySQL reads a row from the storage engine, applies the updates to it, and then writes the new row to the storage engine. To address this throughput limit, TokuDB uses a different update algorithm that simply encodes the update expressions of the SQL statement into tiny programs that are stored in an update Fractal Tree® message. This update message is[Read more...]
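To make the contrast above concrete, here is a minimal Python sketch of the two update strategies. It is a toy model under stated assumptions, not TokuDB's implementation: the `ToyStore` class, its method names, and the `reads` counter are all hypothetical, and a plain dictionary stands in for the storage engine.

```python
from collections import defaultdict


class ToyStore:
    """Toy key-value store contrasting two update strategies.

    Hypothetical illustration only; not TokuDB's actual data structures.
    """

    def __init__(self):
        self.rows = {}                    # materialized rows (stands in for storage)
        self.pending = defaultdict(list)  # per-key queue of deferred update messages
        self.reads = 0                    # counts simulated random reads

    def _read(self, key):
        # Every call models one random read against storage.
        self.reads += 1
        return self.rows.get(key, 0)

    def update_read_modify_write(self, key, fn):
        # Read the row, apply the update, then write the new row back.
        self.rows[key] = fn(self._read(key))

    def update_with_message(self, key, fn):
        # Record the update expression as a message; no read is needed.
        self.pending[key].append(fn)

    def get(self, key):
        # Apply any queued messages, in order, when the row is finally read.
        value = self._read(key)
        for fn in self.pending.pop(key, []):
            value = fn(value)
        self.rows[key] = value
        return value
```

In this sketch, three read-modify-write updates to the same key cost three simulated reads, while three message-based updates cost none until the row is actually queried with `get`.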
University of Montreal Tracks Genomic Data With Tokutek’s TokuDB.
Faster insertion rates, improved scalability and agility support lab’s fast-growing research database as it grows from 100s of GBs to 1 TB and beyond.
Issue addressed: MySQL database used for genomic research must be able to quickly ingest huge amounts of incoming data – hundreds of thousands of records every day. It also must be able to retrieve data quickly in response to a diverse set of research requests.
We wanted to take a moment to say thanks to all of our customers and to the wider MySQL and MariaDB community. Today we announced a doubling of our customer base for the year ending December 31, 2012. Significant milestones over the last year included new technology and service partnerships, several awards, rapid hiring, as well as three upgrades to TokuDB®. We even dabbled in some MongoDB benchmarks. And to fuel continued growth in 2013, we secured additional venture capital funding last November.
Did You Hear? NASA Uses TokuDB for Big Data with MySQL!
To read the full press release and learn more,[Read more...]
TokuDB® is a proven solution that scales MySQL® and MariaDB® from GBs to TBs with unmatched insert and query speed, compression, replication performance and online schema flexibility. Tokutek’s recently launched TokuDB v6.6 delivers all of these features and more, with additional improvements in multi-client, fast SQL updates, and in-memory performance.
Date: January 15th
Time: 2 PM EST / 11 AM PST
Topics will include:
SAP’s HANA – a floor wax *and* a dessert topping?
For 451 Research clients: SAP’s HANA database – a floor wax *and* a dessert topping? bit.ly/13dmDCH
— Matt Aslett (@maslett) January 8, 2013
Attivio has secured $8 million in new growth funding from General Electric Pension Trust | bit.ly/ZwXPFG
— Attivio (@attivio) January 7, 2013
Why We Need To Kill “Big Data” | TechCrunch tcrn.ch/ZpbnDl
— Mortar (@mortardata) January 5,
What a flashback this week. Staring at a text terminal trying to establish a connection with a remote server, I began to fret whether I would get my homework assignment done on time. My mind raced back to college nights years ago in the Fishbowl, hunched over an Athena workstation. Would this be another late night fueled by Jolt cola in order to get my problem set done?
Embarking on my first software class in quite a while was relatively painless, and I have Sheeri Cabral and her detailed guidance to thank. This week I started the[Read more...]
Well, it’s that time of the year again for top ten lists. There have been many versions showing up on the web the last few days, including Time Magazine’s “Top 10 Everything of 2012” list, with 55 wide-ranging lists!
Last year we started using Google Analytics to see which blog content was most popular on Tokutek.com and generated a 2011 top ten list, ending up with a few surprises. This year saw spikes in some interesting areas as well, including flash performance, NASA and Big Data, and MongoDB.
Without further ado, here is the top ten list for 2012:
10. Announcing TokuDB v6.1 –[Read more...]
We’ve all heard the term “Big Data” thrown around a fair amount in the last several years, ever since the rise of Hadoop and other distributed storage methods. But “Big Data” has always been a subjective term that hinges on perspective; what one engineer considers big can be vastly different from what another does.
However, there’s finally a definite description that says Big Data no matter what perspective you operate from: “That facility by my calculations that I submitted to the court for the Electronic Frontiers Foundation against NSA would hold on the order of 5 zettabytes of data. Just that current storage capacity is being advertised on the web that you can buy. And that’s not talking about what they have in the near future.” You can read more about the facility and its purpose here:[Read more...]
There is obviously much being written these days about Big Data. While the term has many different meanings to many different folks, our MySQL and MariaDB customers tend to find their data to be uncomfortably big when the tables become too large for memory. In this case, more storage has to be acquired, performance starts to lag, and making changes to the schema becomes a challenge.[Read more...]
This webinar covers the basics of B-trees and Fractal Tree Indexes, the benchmarks we’ve run so far, and the development road map going forward.
Date: November 13th
Time: 2 PM EST / 11 AM PST
Topics will include:
We look forward to having you join the webinar. We also hope that by sharing these results with[Read more...]
I’ll be presenting “MongoDB and Fractal Tree Indexes” at MongoDB Boston 2012 on October 24th. My presentation covers the basics of B-trees and Fractal Tree Indexes, the benchmarks we’ve run so far, and the development road map going forward.
I’ve been to this one-day conference twice now and both times came away with a better understanding of MongoDB’s capabilities and use cases, and with many questions answered via their deep technical dives. I highly recommend current MongoDB users and anyone considering a MongoDB project attend – it appears that seats are still available.
The core technology behind Tokutek is based on the academic research by our founders: Michael Bender, Bradley Kuszmaul and Martin Farach-Colton. They are all still in academia, in addition to their work at Tokutek.
Back in March, the White House kicked off a new Initiative for Big Data. Last week, the National Science Foundation announced the first interagency grants for this. Eight awards were given, and our own Michael Bender and Martin Farach-Colton, along with Robert Johnson of Stony Brook University, received one of[Read more...]