Two weeks ago Percona announced its acquisition of Tokutek
(April 14, 2015). The analyst coverage was a bit fluffy for my
liking, but I decided to give it some time and see if anything
"meaty" would come along, and ... it hasn't. The sheer number of
tweets was impressive, which makes
me hopeful that the acquisition raised awareness of the Tokutek
technologies and that the Tokutek products have found a good
home.
I've been thinking a lot about the future of the Tokutek
technologies over these same two weeks and want to share my
thoughts publicly. I'm going to cover TokuDB in this blog post, TokuMX in
a few days, and finally Fractal Tree Indexes a few days later.
[Full disclosure: I worked at Tokutek for 3.5 years (08/2011 -
01/2015) as VP/Engineering and I do not have any equity in
Tokutek or Percona]
Thoughts on Percona + TokuDB
Integration and Ease of Use
- Percona will certainly spend the time to make using the TokuDB storage engine in Percona Server as easy and foolproof as possible. Prior to the acquisition, users needed to download and install an additional package to use the TokuDB storage engine in Percona Server (this was not the case for those downloading directly from the Tokutek web site or downloading from MariaDB). I hope that TokuDB becomes part of the base Percona Server package and that the plugin is installed by default. At that point the process of trying TokuDB is as easy as adding "engine=TokuDB" in a CREATE TABLE statement.
- Memory usage (cache). InnoDB supports a user-defined cache size, as does TokuDB. Users often allocate more than 50% of the server's memory to their storage engine's cache and can easily over-allocate that memory if using both engines. I'm not sure what the ideal solution is for this problem. I think InnoDB supports dynamic cache sizing in MySQL 5.7; perhaps adding this feature to TokuDB and automatically adjusting the [over]allocation would work.
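To make the over-allocation problem concrete, here is a minimal sketch of the kind of sanity check a DBA might run before enabling both engines. It assumes the two cache sizes come from `innodb_buffer_pool_size` and `tokudb_cache_size`; the function name and the 80% threshold are my own illustrative choices, not anything either engine enforces.

```python
GIB = 1024 ** 3


def cache_allocation_report(total_ram_bytes, innodb_buffer_pool_size,
                            tokudb_cache_size, max_fraction=0.8):
    """Report how much of the server's RAM the two engine caches claim.

    max_fraction is an assumed threshold for illustration only; the right
    value depends on what else runs on the server.
    """
    combined = innodb_buffer_pool_size + tokudb_cache_size
    fraction = combined / total_ram_bytes
    return {
        "combined_bytes": combined,
        "fraction_of_ram": fraction,
        "over_allocated": fraction > max_fraction,
    }


if __name__ == "__main__":
    # A 64 GiB server where each engine was sized as if it were the only one.
    report = cache_allocation_report(64 * GIB, 40 * GIB, 32 * GIB)
    print(report["over_allocated"])  # True: 72 GiB of cache on a 64 GiB box
```

Each setting looks reasonable on its own, which is exactly why the combined total is easy to miss.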
Foreign Keys
- Had MySQL implemented foreign key constraints above the storage engine this would have been a non-issue, but it didn't. They are implemented within InnoDB. Will foreign keys ever come to TokuDB? I'd argue that it's not worth the effort, and users needing foreign keys can always use InnoDB for those specific tables. But lack of foreign keys certainly complicates the user's experience.
Files, Files, Files
- Most people use InnoDB's file-per-table option, meaning a single file is created in the file system for each table (I'm not going to count .frm files). In contrast, TokuDB creates 2 files for a table, and another file for each secondary index. A great benefit of this approach is that dropping an index is instantaneous, and all the space for that index is returned immediately. The downside is the sheer number of files, especially if you have a large number of tables. You get a lot more files if you partition your tables (a full set of the aforementioned files for each partition).
- All TokuDB files are kept in the root of the data folder (or they can be put in a single TokuDB defined data directory). Moving the files to the individual database folders would be a nice feature.
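The file counts described above add up quickly, and a little arithmetic shows how. This is a rough sketch based only on the numbers in this post (2 files per table, 1 per secondary index, a full set per partition); the function name is hypothetical and .frm files are ignored, as above.

```python
def tokudb_file_count(secondary_indexes, partitions=1):
    """Estimate the number of TokuDB files for a single table.

    Uses the counts from this post: 2 files per table plus 1 per secondary
    index, with the full set repeated for every partition.
    """
    files_per_partition = 2 + secondary_indexes
    return files_per_partition * partitions


if __name__ == "__main__":
    tables = 500
    # Unpartitioned tables with 3 secondary indexes each.
    print(tables * tokudb_file_count(3))                 # 2500
    # The same schema with 12 partitions per table.
    print(tables * tokudb_file_count(3, partitions=12))  # 30000
```

For comparison, the unpartitioned schema would be 500 files with InnoDB's file-per-table option.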
Hot Backup vs. XtraBackup
- Creating an online backup of TokuDB is significantly different from performing the same operation on InnoDB. Percona created XtraBackup to simplify the backup process for InnoDB and it is now a feature-rich backup technology (full backups, incremental backups, etc). XtraBackup does not work on TokuDB tables.
- TokuDB's hot backup feature is enterprise edition only (it is the paid feature), closed source, and only supports the creation of a full backup. It does work on InnoDB tables, as long as asynchronous IO is not enabled.
- What does the future hold? It would be great to see TokuDB's hot backup functionality merged into XtraBackup so a single backup technology existed that "just worked" for both storage engines.
- At the moment it feels weird that Percona owns/offers a closed source technology. Open sourcing the TokuDB hot backup would be nice to see.
Instrumentation and Utilities
- Percona Server is well known for the additional instrumentation it provides. It would be awesome if this operational "tooling" could also be applied to the TokuDB storage engine internals and exposed for easy consumption.
- It will also be interesting to see if TokuDB gets more attention in Percona's cloud tools. Percona could collect and analyze information from servers using InnoDB and make the recommendation that TokuDB be adopted by analyzing the user's workload.
- As Percona provides support to TokuDB customers and gathers feedback from the TokuDB community there will likely be features and new utilities added to the Percona Toolkit.
Native Partitioning
- InnoDB is adding native partitioning in MySQL 5.7. Partitioning is currently handled by what is essentially a "storage engine", which is pretty cool. A big downside to this implementation is that queries needing data from multiple partitions query each partition in order, and can take a long time when the number of partitions is large. I assume that InnoDB's long term plans for native partitioning are to support concurrent queries on multiple partitions. We shall see. Percona will need to invest in TokuDB to bring native partitioning to it as well.
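A back-of-the-envelope model shows why querying partitions in order hurts as the partition count grows. This is an idealized sketch, not a description of how either engine executes queries: it assumes each partition takes a fixed time to scan, and the "concurrent" case assumes unlimited parallelism with no coordination overhead or I/O contention. Both function names are hypothetical.

```python
def sequential_scan_ms(partition_times_ms):
    """Total time when partitions are queried one after another."""
    return sum(partition_times_ms)


def concurrent_scan_ms(partition_times_ms):
    """Idealized time when all partitions are queried at once.

    Assumes unlimited parallelism, so the slowest partition sets the total.
    """
    return max(partition_times_ms, default=0)


if __name__ == "__main__":
    # 100 partitions, each taking 50 ms to scan.
    times = [50] * 100
    print(sequential_scan_ms(times))  # 5000
    print(concurrent_scan_ms(times))  # 50
```

Real gains would be smaller than this, but the sequential cost growing linearly with the number of partitions is the point.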
MySQL and MariaDB Support
- Will Percona assist MariaDB with the engineering/QA/packaging of TokuDB?
- Will Percona offer a MySQL version of TokuDB as Tokutek has in the past?
- Time will tell.
Human Resources
- Percona has been performing MySQL related engineering for quite a while now, but TokuDB is not exactly the same effort as XtraDB and XtraBackup. I'm curious to see how much they are looking to grow the team after adding TokuDB to their product list. They've already posted on their jobs page for "C/C++ Developers for TokuDB, TokuMX, and Tokutek Products".
- Adding Percona based support for TokuDB is a huge win for current and future TokuDB customers.
- Percona consulting will quickly learn the best workloads for TokuDB which should grow the user base (both paid and community).
- I'm excited about all of these possibilities for TokuDB.
I will probably come up with more thoughts over time, but this
feels like a good place to stop for now. I'll post my TokuMX and
TokuMXse thoughts in a few days.
Please ask questions or comment below.