I’ve been considering using TokuDB for a large dataset, primarily because of its high compression. The data is append-only, never updated, rarely read, and purged after a configurable time.
I use partitions to drop old data a day at a time. It’s much more efficient than deleting rows, and it lets me avoid indexing the data on the time dimension. Partitioning serves as a crude form of indexing, as well as helping purge old data.
I wondered if TokuDB supports partitioning. Then I remembered some older posts from the Tokutek blog about partitioning. The claim is that “there are almost always better (higher performing, more robust, lower maintenance) alternatives to partitioning.”
I’m not sure this is true for my use case, for a couple of reasons.
First, I clearly fall into the only category that the flowchart …
[Read more]