Showing entries 21 to 30 of 44
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: compression (reset)
How TokuMX Gets Great Compression for MongoDB

In my last post, I showed what a Fractal Tree® index is at a high level. Once again, the Fractal Tree index is the data structure inside TokuMX and TokuDB, our MongoDB and MySQL products. One of its strengths is the ability to get high levels of compression on the stored data. In this post, I’ll explain why that is.

At a high level, one can argue that there isn’t anything special about our compression algorithms. We basically do this: we take large chunks of data, use known compression methods (e.g. zlib, lzma, …

[Read more]
Announcing TokuMX v1.0: Toku+Mongo = You Can Have It All

Tokutek is known for its full-featured fast-indexing technology. MongoDB is known for its great document-based data model and ease of use. TokuMX, version 1.0, combines the best of both worlds.

  • So what, exactly, is TokuMX? The simplest (but incomplete) answer is that TokuMX is MongoDB with all its storage code replaced by Tokutek’s Fractal Tree indexes.
  • How do Fractal Tree indexes improve MongoDB? The direct benefits include high-performance indexing, strong compression, and performance stability – in other words, the performance stays high, even when data is larger than RAM.
  • Are there any features in TokuMX that MongoDB doesn’t have? Yes. We have added support for transactions to TokuMX, so that TokuMX is ACID compliant and has MVCC. We have also added support for clustering indexes, which dramatically accelerate many types of queries.
[Read more]
Understanding Tokutek Fractal Tree Indexes

Download PDF Presentation

Thanks to Tim Callaghan for speaking Tuesday night at the Effective MySQL New York meetup on Fractal Tree Indexes : Theory and Practice (MySQL and MongoDB). There was a good turnout and a full room to learn how the TokuDB storage engine from Tokutek is changing how to handle big data in MySQL.

Also interesting is how the same technology has been applied for use in MongoDB including giving MongoDB transactions; a big change for NoSQL.

Related News: …

[Read more]
Sysbench Benchmark for MongoDB

As we continue to test our Fractal Tree Indexing with MongoDB, I’ve been updating my benchmark infrastructure so I can compare performance, correctness, and resource utilization.  Sysbench has long been a standard for testing MySQL performance, so I created a version that is compatible with MongoDB.  You can grab my current version of Sysbench for MongoDB here.

So what exactly is Sysbench?  According to the Sysbench homepage, “Sysbench is a modular, cross-platform and multi-threaded benchmark tool for evaluating OS [Operating System] parameters that are important for a system running a database under intensive load.”

  • Sysbench schema
    • 16 copies of the same collection, named sbtest1 … sbtest16, each with 10 …
[Read more]
Tracking 5.3 Billion Mutations: Using MySQL for Genomic Big Data

University of Montreal Tracks Genomic Data With Tokutek’s TokuDB.

Faster insertion rates, improved scalability and agility support lab’s fast growing research database as it grows from 100s of GBs to 1 TB and beyond.

Issue addressed: MySQL database used for genomic research must be able to quickly ingest huge amounts of incoming data – hundreds of thousands of records every day. It also must be able to retrieve data quickly in response to a diverse set of research requests.

Enabling the Hunt for New Cures for Diseases by Seamlessly Processing Billions of Mutations in Epidemiology Records

The Organization: The The Philip Awadalla Laboratory is the Medical and …

[Read more]
Move over Marcia: Top Ten for 2012

Well, it’s that time of the year again for top ten lists. There have been many versions showing up on the web the last few days, including Time Magazine’s “Top 10 Everything of 2012″ list, with 55 wide ranging lists!

Last year we started using Google Analytics to see what content for blogs was most popular on and generated a 2011 top ten list, ending up with a few surprises.  This year saw spikes in some interesting areas as well, including flash performance, NASA and Big Data, and MongoDB.

Without further adieu, here is the top ten list for 2012:

10. Announcing TokuDB v6.1 – This release included better overall performance and brought …

[Read more]
Webinar: Introduction to TokuDB v6.5

TokuDB® is a proven solution that scales MySQL® and MariaDB® from GBs to TBs with unmatched insert and query speed, compression, replication performance and online schema flexibility. Tokutek’s recently launched TokuDB v6.5 delivers all of these features and more, not just for HDDs, but also for flash memory.

Originally Aired: October 10th

TokuDB v6.5:

  • Stores 10x More Data – TokuDB delivers 10x compression without any performance degradation. Users can therefore take advantage of much greater amounts of available space without paying more for additional storage.
  • Delivers High Insertion Speed – TokuDB Fractal Tree® indexes …
[Read more]
Helping to Reduce Page Compression Failures Rate

When InnoDB compresses a page it needs the result to fit into its predetermined compressed page size (specified with KEY_BLOCK_SIZE). When the result does not fit we call that a compression failure. In this case InnoDB needs to split up the page and try to compress again. That said, compression failures are bad for performance and should be minimized.

Whether the result of the compression will fit largely depends on the data being compressed and some tables and/or indexes may contain more compressible data than others. And so it would be nice if the compression failure rate, along with other compression stats, could be monitored on a per table or even on a per index basis, wouldn't it?

This is where the new INFORMATION_SCHEMA table in MySQL 5.6 kicks in. INFORMATION_SCHEMA.INNODB_CMP_PER_INDEX provides exactly this helpful information. It contains the following fields:

[Read more]
What compression do you use?

The following is an evaluation of various compression utilities that I tested when reviewing the various options for MySQL backup strategies. The overall winner in performance was pigz, a parallel implementation of gzip. If you use gzip today as most organizations do, this one change will improve your backup compression times.

Details of the test:

  • The database is 5.4GB of data
  • mysqldump produces a backup file of 2.9GB
  • The server is an AWS t1.xlarge with a dedicated EBS volume for backups

The following testing was performed to compare the time and % compression savings of various available open source products. This was not an exhaustive test with multiple iterations and different types of data files.

Compression Time
Decompression Time
[Read more]
TokuDB v6.0: Even Better Compression

A key feature of our new TokuDB v6.0 release, which I have been blogging about this week, is compression. Compression is always on in TokuDB, and the compression we’ve achieved in the past has been quite good. See a previous post on the 18x compression achieved by TokuDB v5.0 on one benchmark. In our latest release, we’ve updated the way compression works and got 50% improvement on compression.

I decided to present numbers on the same set of data as the old post, so see that post for experimental details.

But first, what are the changes? TokuDB compresses large blocks of data — on the order of MB, rather than the 16KB that InnoDB uses — …

[Read more]
Showing entries 21 to 30 of 44
« 10 Newer Entries | 10 Older Entries »