Showing entries 1 to 25

Displaying posts with tag: compression (reset)

Understanding Tokutek Fractal Tree Indexes

Download PDF Presentation

Thanks to Tim Callaghan for speaking Tuesday night at the Effective MySQL New York meetup on Fractal Tree Indexes: Theory and Practice (MySQL and MongoDB). There was a good turnout and a full room to learn how the TokuDB storage engine from Tokutek is changing how to handle big data in MySQL.

Also interesting is how the same technology has been applied for use in MongoDB including giving MongoDB transactions; a big change for NoSQL.

Related News: Tokutek Meets Big Data Demand With Open Source TokuDB

Sysbench Benchmark for MongoDB

As we continue to test our Fractal Tree Indexing with MongoDB, I’ve been updating my benchmark infrastructure so I can compare performance, correctness, and resource utilization.  Sysbench has long been a standard for testing MySQL performance, so I created a version that is compatible with MongoDB.  You can grab my current version of Sysbench for MongoDB here.

So what exactly is Sysbench?  According to the Sysbench homepage, “Sysbench is a modular, cross-platform and multi-threaded benchmark tool for evaluating OS [Operating System] parameters that are important for a system running a database under intensive load.”

  • Sysbench schema
    • 16 copies of the same collection,

  [Read more...]
Tracking 5.3 Billion Mutations: Using MySQL for Genomic Big Data

University of Montreal Tracks Genomic Data With Tokutek’s TokuDB.

Faster insertion rates, improved scalability and agility support lab’s fast growing research database as it grows from 100s of GBs to 1 TB and beyond.

Issue addressed: MySQL database used for genomic research must be able to quickly ingest huge amounts of incoming data – hundreds of thousands of records every day. It also must be able to retrieve data quickly in response to a diverse set of research requests.

Enabling the Hunt for New Cures for Diseases by Seamlessly Processing Billions of Mutations  [Read more...]

Move over Marcia: Top Ten for 2012

Well, it’s that time of the year again for top ten lists. There have been many versions showing up on the web the last few days, including Time Magazine’s “Top 10 Everything of 2012” list, with 55 wide-ranging lists!

Last year we started using Google Analytics to see what content for blogs was most popular on Tokutek.com and generated a 2011 top ten list, ending up with a few surprises.  This year saw spikes in some interesting areas as well, including flash performance, NASA and Big Data, and MongoDB.

Without further ado, here is the top ten list for 2012:

10. Announcing TokuDB v6.1

  [Read more...]
Webinar: Introduction to TokuDB v6.5

TokuDB® is a proven solution that scales MySQL® and MariaDB® from GBs to TBs with unmatched insert and query speed, compression, replication performance and online schema flexibility. Tokutek’s recently launched TokuDB v6.5 delivers all of these features and more, not just for HDDs, but also for flash memory.

Originally Aired: October 10th
AVAILABLE ON DEMAND

TokuDB v6.5:

  • Stores 10x More Data – TokuDB delivers 10x compression without any performance degradation. Users can therefore take advantage of much greater amounts of available space without paying more for additional storage.
  • Delivers High Insertion Speed – TokuDB

  [Read more...]
Helping to Reduce Page Compression Failures Rate

When InnoDB compresses a page it needs the result to fit into its predetermined compressed page size (specified with KEY_BLOCK_SIZE). When the result does not fit we call that a compression failure. In this case InnoDB needs to split up the page and try to compress again. That said, compression failures are bad for performance and should be minimized.

Whether the result of the compression will fit largely depends on the data being compressed and some tables and/or indexes may contain more compressible data than others. And so it would be nice if the compression failure rate, along with other compression stats, could be monitored on a per table or even on a per index basis, wouldn't it?
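As a rough sketch of what per-index monitoring boils down to: the failure rate is the share of compression attempts whose result did not fit. The snippet assumes two counters per index, attempts and successes. The names `compress_ops` and `compress_ops_ok` follow the columns of INFORMATION_SCHEMA.INNODB_CMP_PER_INDEX in MySQL 5.6; the helper name and the sample rows are made up for illustration and are not measurements.

```python
from typing import Dict, List, Tuple


def failure_rates(rows: List[Dict]) -> List[Tuple[str, float]]:
    """Return (table.index, failure rate) pairs, worst index first."""
    rates = []
    for row in rows:
        ops = row["compress_ops"]
        if ops == 0:
            continue  # nothing was compressed yet, so there is no rate to report
        failed = ops - row["compress_ops_ok"]
        name = f'{row["table_name"]}.{row["index_name"]}'
        rates.append((name, failed / ops))
    return sorted(rates, key=lambda pair: pair[1], reverse=True)


# Hypothetical counters, invented for this example.
sample = [
    {"table_name": "t1", "index_name": "PRIMARY",
     "compress_ops": 200, "compress_ops_ok": 150},
    {"table_name": "t1", "index_name": "idx_a",
     "compress_ops": 100, "compress_ops_ok": 99},
    {"table_name": "t2", "index_name": "PRIMARY",
     "compress_ops": 0, "compress_ops_ok": 0},
]

worst_first = failure_rates(sample)
```

Sorting worst first puts the indexes most likely to benefit from a different KEY_BLOCK_SIZE at the top of the list.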

This is where the new INFORMATION_SCHEMA table in MySQL 5.6 kicks in. INFORMATION_SCHEMA.INNODB_CMP_PER_INDEX provides exactly this helpful information. It contains the



  [Read more...]
What compression do you use?

The following is an evaluation of various compression utilities that I tested when reviewing the various options for MySQL backup strategies. The overall winner in performance was pigz, a parallel implementation of gzip. If you use gzip today as most organizations do, this one change will improve your backup compression times.

Details of the test:

  • The database is 5.4GB of data
  • mysqldump produces a backup file of 2.9GB
  • The server is an AWS t1.xlarge with a dedicated EBS volume for backups

The following testing was performed to compare the time and % compression savings of various available open source products. This was not an exhaustive test with multiple iterations and different types of data files.

Compression Utility | Compression Time (sec) | Decompression Time (sec) | New Size (% Saving)
lzo



  [Read more...]
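A comparison like the one described in this post can be sketched in a few lines. This is a minimal illustration, not the author's test harness: pigz and lzo are not in the Python standard library, so the stdlib gzip, bzip2 and xz codecs stand in for them, and the payload is synthetic rather than a real mysqldump file. The function and dictionary names are invented for the example.

```python
import bz2
import gzip
import lzma
import time
from typing import Callable, Dict


def measure(compress: Callable[[bytes], bytes],
            decompress: Callable[[bytes], bytes],
            data: bytes) -> Dict[str, float]:
    """Time one compress/decompress round trip and report the size saving."""
    start = time.perf_counter()
    packed = compress(data)
    compress_sec = time.perf_counter() - start

    start = time.perf_counter()
    restored = decompress(packed)
    decompress_sec = time.perf_counter() - start

    if restored != data:
        raise ValueError("round trip changed the data")
    # Percentage of the original size that compression removed.
    saving_pct = 100.0 * (1 - len(packed) / len(data))
    return {"compress_sec": compress_sec,
            "decompress_sec": decompress_sec,
            "saving_pct": saving_pct}


# Stand-in codecs from the standard library.
CODECS = {
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
    "xz": (lzma.compress, lzma.decompress),
}


def compare(data: bytes) -> Dict[str, Dict[str, float]]:
    """Run every codec over the same non-empty payload."""
    return {name: measure(c, d, data) for name, (c, d) in CODECS.items()}


if __name__ == "__main__":
    payload = b"INSERT INTO t VALUES (1, 'example');\n" * 20000
    for name, result in compare(payload).items():
        print(name, result)
```

A single run on one payload says little, as the post itself cautions; repeating the measurement on representative dump files gives more trustworthy numbers.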
TokuDB v6.0: Even Better Compression

A key feature of our new TokuDB v6.0 release, which I have been blogging about this week, is compression. Compression is always on in TokuDB, and the compression we’ve achieved in the past has been quite good. See a previous post on the 18x compression achieved by TokuDB v5.0 on one benchmark. In our latest release, we’ve updated the way compression works and achieved a 50% improvement in compression.

I decided to present numbers on the same set of data as the old post, so see that post for experimental details.

But first, what are the changes? TokuDB compresses large blocks

  [Read more...]
Announcing TokuDB v6.0: Less Slave Lag and More Compression

We are excited to announce TokuDB® v6.0, the latest version of Tokutek’s flagship storage engine for MySQL and MariaDB.

This version offers feature and performance enhancements over previous releases, support for XA (two-phase transactional commits), better compression, and reduced performance variability associated with checkpointing. This release also brings TokuDB support up to date on MySQL v5.1, MySQL v5.5 and MariaDB v5.2. There’s a lot of great technical stuff under the hood in this release and I’ll be reviewing the improvements one-by-one over the course of this week.

I’ll be posting more details about the new features and performance, so here’s an overview of what’s in store.

Replication Slave Lag One of the things TokuDB does well is single-threaded insertions, which translates directly into  [Read more...]
On InnoDB compression in production

Our latest changes have been pushed to the public mysql@facebook branch, allowing this post to happen \o/

Recently we started rolling out InnoDB compression to our main database tier, and that has been a huge undertaking for multiple teams and a major test for MySQL. Nizam was surely the hero of all this work; make sure you don’t miss his talk about it at the MySQL conference.

Though the MySQL manuals have quite some introduction about the benefits of compression, we agree that the benefits are good: in theory we can do fewer reads from disk, keep more data in buffer pool or

  [Read more...]
Evidenzia Upgrades to TokuDB v5.2 to Address Storage Growth and Scale Performance

Ensuring sufficient disk I/O to catch copyright violations at network speed.

Evidenzia GmbH & Co. KG

Issues addressed:

  • Storage growth, including maxed-out disk I/O utilization
  • Performance issues and business impact due to slow selects
  • Inability to revise data schema on the fly

The Company: Evidenzia GmbH & Co. KG is one of the leading partners of the software, movie and music industry when it comes to tracing copyright infringements

  [Read more...]
Announcing TokuDB v5.2: Improved Multi-Client Scaling and Faster Queries

TokuDB® v5.2, the latest version of Tokutek’s flagship storage engine for MySQL and MariaDB, is now available.

This version offers performance enhancements over previous releases, especially for multi-client scale up and point queries, and extends the cases where ALTER TABLE is non-blocking, in particular adding Hot Column Rename.

TokuDB v5.2 maintains all our established advantages: fast trickle load, fast bulk load, fast range queries through clustering indexes, hot schema changes, great compression, no fragmentation, and full MySQL compatibility for ease of installation. See our benchmark page for details.

Multi-client workloads

In TokuDB v5.2, we have reworked our locking scheme to better support multi-client workloads, and as

  [Read more...]
Are You Forcing MySQL to Do Twice as Many JOINs as Necessary?
Baron Schwartz This guest post is from our friends at Percona. They’re hosting Percona Live London from October 24-25, 2011. Percona Live is a two day summit with 100% technical sessions led by some of the most established speakers in the MySQL field.

In the London area and interested in attending? We are giving away two free passes in the next few days. Watch our @tokutek twitter feed for a chance to win.

Did you know that the following query actually performs a JOIN? You can’t see it, but it’s there:

SELECT the_day, COUNT(*), SUM(clicks),

  [Read more...]
Compression Benchmarking: Size vs. Speed (I want both)

I’m creating a library of benchmarks and test suites that will run as part of a Continuous Integration (CI) process here at Tokutek. My goal is to regularly measure several aspects of our storage engine over time: performance, correctness, memory/CPU/disk utilization, etc. I’ll also be running tests against InnoDB and other databases for comparative analysis. I plan on posting a series of blog entries as my CI framework evolves; for now I have the results of my first benchmark.

Compression is an always-on feature of TokuDB. There are no server/session variables to enable compression or change the compression level (one goal of TokuDB is to have as few tuning parameters as possible). My compression benchmark uses

  [Read more...]
Failure to compress

My team has been evaluating the performance of InnoDB compression. One of the workloads is to reload all tables of the database using concurrent connections. This is done with production data and via the insert benchmark. We also modified InnoDB to use qpress/quicklz as an alternative to zlib compression.

 

As I write this I realize that this is a great advertisement for a self-tuning DBMS. Alas, MySQL and InnoDB are not self-tuning when you have extreme performance requirements. Eventually they will move in that direction, and many users today don't require this amount of tuning. Some of the parameters listed below are only in patches for MySQL from Facebook and others. Official MySQL tends to be better about making

  [Read more...]
When does InnoDB compress and decompress pages?

There are two sections for rows in the page format for InnoDB compressed tables. The compressed section has one or more rows and must be decompressed to access individual rows. The modification log has uncompressed rows and rows can be accessed without decompressing. The modification log is used to avoid decompressing and then possibly recompressing the compressed section on every row change. The buffer pool also has separate uncompressed copies of some pages so that every row read does not require a page decompression.
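To make the role of the modification log concrete, here is a toy model, assuming a page is just a zlib-compressed blob of newline-separated rows plus a small uncompressed log. It is not InnoDB's page format, and the real conditions that trigger recompression differ; the class name, the `log_capacity` limit and the row encoding are all hypothetical. Rows are assumed to be non-empty and free of newlines.

```python
import zlib
from typing import List


class ToyCompressedPage:
    """Toy model only: a compressed section plus an uncompressed modification log."""

    def __init__(self, log_capacity: int = 4) -> None:
        self.compressed = zlib.compress(b"")  # compressed section, initially empty
        self.mod_log: List[bytes] = []        # recent rows, readable without decompression
        self.log_capacity = log_capacity      # hypothetical limit, counted in rows
        self.recompressions = 0

    def _decompress_rows(self) -> List[bytes]:
        raw = zlib.decompress(self.compressed)
        return raw.split(b"\n") if raw else []

    def insert(self, row: bytes) -> None:
        if len(self.mod_log) >= self.log_capacity:
            # The log is full: fold it into the compressed section and
            # recompress once, instead of recompressing on every insert.
            rows = self._decompress_rows() + self.mod_log
            self.compressed = zlib.compress(b"\n".join(rows))
            self.mod_log = []
            self.recompressions += 1
        self.mod_log.append(row)

    def rows(self) -> List[bytes]:
        """Reading everything needs one decompression plus the log."""
        return self._decompress_rows() + self.mod_log


page = ToyCompressedPage(log_capacity=4)
for i in range(10):
    page.insert(b"row-%d" % i)
```

With a log of four rows, ten inserts cost two recompressions in this model rather than ten, which is the saving the modification log is there to provide.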

 

I want to understand when a page must be decompressed or recompressed. This is definitely an incomplete list.

  • A page is decompressed when a row is read and the uncompressed version of the page is not in the buffer pool.
  • I think a row can be
  [Read more...]
How does InnoDB manage the LRU for compressed pages?

InnoDB uses at least two lists to manage pages in the buffer pool. The LRU is used for pages from compressed and uncompressed tables. If a table is uncompressed then it only uses the LRU. If a table is compressed then compressed pages are on the LRU and the unzip_LRU is used for pages that have an uncompressed and compressed version in the buffer pool. When a server is IO bound then InnoDB allows the unzip_LRU to be 10% the size of the LRU. SHOW INNODB STATUS displays the length of the LRU and unzip_LRU (grep it for "unzip_LRU"). Note that I use the word page in some places when frame might be

  [Read more...]
A few notes on InnoDB PRIMARY KEY
InnoDB uses an index-organized data storage technique, wherein the primary key acts as the clustered index and this clustered index holds the data. It's for this reason that understanding the basics of the InnoDB primary key is very important, and hence the need for these notes.
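The idea of an index-organized table can be illustrated with a small sketch. It assumes plain dictionaries in place of B-trees and a hypothetical `email` column with a secondary index; the class and method names are invented. The point it shows is that rows live in the clustered index itself, while a secondary index entry only carries the primary key, so a secondary lookup takes two steps.

```python
from typing import Dict, List, Optional


class ToyTable:
    """Toy index-organized table: dictionaries stand in for B-trees."""

    def __init__(self) -> None:
        self.clustered: Dict[int, dict] = {}      # primary key -> full row
        self.by_email: Dict[str, List[int]] = {}  # secondary index: value -> primary keys

    def insert(self, pk: int, row: dict) -> None:
        if pk in self.clustered:
            raise KeyError(f"duplicate primary key {pk}")
        self.clustered[pk] = row
        self.by_email.setdefault(row["email"], []).append(pk)

    def get_by_pk(self, pk: int) -> Optional[dict]:
        # One lookup: the clustered index holds the row itself.
        return self.clustered.get(pk)

    def get_by_email(self, email: str) -> List[dict]:
        # Two steps: the secondary index yields primary keys,
        # then the clustered index yields the rows.
        return [self.clustered[pk] for pk in self.by_email.get(email, [])]


table = ToyTable()
table.insert(1, {"email": "a@example.com", "name": "Ann"})
table.insert(2, {"email": "b@example.com", "name": "Bob"})
```

Because every secondary index entry carries the primary key, a compact primary key keeps all secondary indexes smaller as well.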
MySQL 5.5: Improved manageability, efficiency for InnoDB
In my continuing blog series on MySQL 5.5 features (see performance/scale and replication entries), today I am covering some of the new InnoDB manageability and efficiency options. 5.5, with the newly re-architected InnoDB, provides better user control over internal InnoDB settings so things like performance, scale and storage can easily be monitored, tuned and optimized for specific use cases and application loads.

Along these lines, some of the key advances and features available in MySQL 5.5 and InnoDB are:
 
  • Faster Index Creation - MySQL 5.5 can now add or drop indexes without copying the underlying data of the entire target table. 




  [Read more...]
ZFS and Solaris: storage optimization for the cloud

Cloud computing has been one of the most discussed topics over the year, and the discussion is not over, because what is really being discussed is the way we will access computing and storage resources in the future. Even famous French intellectuals are giving their opinions and making predictions. The future will decide on the accuracy of those predictions.

What is usually less discussed is the technology behind cloud computing, though it is no secret that virtualization is playing a key role. Cloud data centers will be loaded with virtual machines, each of them potentially requiring as much disk space as a complete operating system (OS), which can go up to many gigabytes. How much disk space does a virtual machine image (vdi) really consume? The only good answer is: too much.

  [Read more...]
ZFS & MySQL/InnoDB Compression Update

Network.com setup in Vegas, Thumper disk bay, green by Shawn Ferry

As I expected it would, the fact that I used ZFS compression on our MySQL volume in my little OpenSolaris experiment struck a chord in the comments. I chose gzip-9 for our first pass for a few reasons:

  • I wanted to see what the “best case” compression ratio was for our dataset (InnoDB tables)
  • I wanted to see what the “worst case” CPU usage was for our workload
  • I don’t have a lot of time. I need to try something quick & dirty.
  • I got both

  [Read more...]
Success with OpenSolaris + ZFS + MySQL in production!

Pimp My Drive by Richard and Barb

There’s remarkably little information online about using MySQL on ZFS, successfully or not, so I did what any enterprising geek would do: Built a box, threw some data on it, and tossed it into production to see if it would sink or swim.

I’m a Linux geek, have been since 1993 (Slackware!). All of SmugMug’s datacenters (and

  [Read more...]
Planet MySQL © 1995, 2013, Oracle Corporation and/or its affiliates

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.