
Displaying posts with tag: compression

MySQL Enterprise Backup 3.10: Teasing compression.

OK, so I wanted to look into the new compression options of MEB 3.10.

And I would like to share my tests with you. Remember, they're just that: tests. So please feel free to copy and paste, obtain your own results and conclusions and, dare I say it, baselines, in order to compare future behaviour on your own system.

An Oracle Linux 6.3 virtual machine with 3 GB of RAM and 2 virtual CPUs, on a single quad-core Windows laptop. Not pretty, but hey.

So, these tests are solely about backup. I’ll do restore when I get some *more* time.

 

First up, let's compare like with like, i.e. MEB versions 3.9 and 3.10:

Let's make this interesting and use as many resources as possible: read, write and process threads, and the number of buffers.

mysqlbackup --user=root --password=oracle
  [Read more...]
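The command is cut off by the excerpt. As a rough sketch of the kind of invocation being compared (the thread counts, buffer count, backup directory and the LZ4 method are illustrative assumptions, not the post's actual values; --compress-method with lz4/lzma is, to my knowledge, new in 3.10, while 3.9 compressed with zlib only):

mysqlbackup --user=root --password=oracle \
  --backup-dir=/backups/meb310 \
  --read-threads=3 --process-threads=6 --write-threads=3 \
  --number-of-buffers=16 \
  --compress --compress-method=lz4 \
  backup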
InnoDB Transparent PageIO Compression

We have released some code in a labs release that does compression at the InnoDB IO layer. Let me answer the most frequently asked question: it will work on any OS/file system that supports sparse files and has “punch hole” support. It is not specific to FusionIO. However, I've been told by the FusionIO developers that you will get two benefits from FusionIO + NVMFS: no fragmentation issues, and more space savings because of a smaller file system block size. Why the block size matters I will attempt to explain next.

The high-level idea is rather simple. Given a 16K page, we compress it using your favorite compression algorithm and write out only the compressed data. After writing out the data we “punch a hole” to release the unused part of the original 16K block back to the file system. Let me illustrate with an example:

[DDDDDDDDDDDDDDDD]

  [Read more...]
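The illustration is cut short by the excerpt, but the mechanism can be sketched with the Linux fallocate(1) utility (the file name, offsets and the 4K compressed size are made-up numbers; the engine itself would use the fallocate() system call rather than the shell tool):

# Suppose the 16K page at offset 32768 compressed down to 4K.
# After writing the 4K of compressed data, release the trailing 12K:
fallocate --punch-hole --offset $((32768 + 4096)) --length $((16384 - 4096)) datafile.ibd
# The file keeps its logical size, but the hole no longer occupies disk blocks.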
Significant performance boost with new MariaDB page compression on FusionIO

The MariaDB project is pleased to announce a special preview release of MariaDB 10.0.9 with significant performance gains on FusionIO devices. This is a beta-quality preview release.

Download MariaDB 10.0.9-FusionIO preview

Background

The latest work between MariaDB and FusionIO has focused on dramatically improving performance of MariaDB on the high-end SSD drives produced by Fusion-IO and at the same time delivering much better endurance for the drives themselves. Furthermore, FusionIO flash memory solutions increase transactional database performance. MariaDB includes specialized improvements for FusionIO devices, leveraging a feature of the NVMFS filesystem on these popular, high performance solid state disks. Using this feature, MariaDB 10 can

  [Read more...]
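The excerpt stops mid-sentence. For reference, this page-compression work surfaced as table options when it was later merged into mainline MariaDB (10.1); a hedged sketch (the table definition is illustrative):

CREATE TABLE metrics (
  id BIGINT NOT NULL PRIMARY KEY,
  payload BLOB
) ENGINE=InnoDB
  PAGE_COMPRESSED=1          -- compress pages before they are written out
  PAGE_COMPRESSION_LEVEL=6;  -- compression level trade-off (1-9)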
MariaDB world record price per row 0.0000005$ on a single DELL R710
Don't look for an industry benchmark here; it's a real client story.

200 billion records in a month, and it should be transactional but not durable.

For a regular workload we use LOAD DATA INFILE into partitioned InnoDB, but here we have an estimated 15 TB of RAID storage. That is a lot of disks, and it can no longer fit inside a single server's internal storage.

MariaDB 5.5 comes with the TokuDB storage engine for compression, but is it possible within the time frame imposed by the workload?

We started benchmarking with 380 GB of raw input data files, 6 billion rows.

First, let's check the compression with the dataset.
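For context, the regular-workload pattern mentioned above — LOAD DATA INFILE into a partitioned, compressed table — looks roughly like this (a hypothetical sketch: the schema, partitioning and file path are illustrative, and TOKUDB_LZMA is just one of TokuDB's row formats):

CREATE TABLE events (
  id BIGINT NOT NULL,
  ts DATETIME NOT NULL,
  payload VARCHAR(255),
  PRIMARY KEY (ts, id)
) ENGINE=TokuDB ROW_FORMAT=TOKUDB_LZMA
PARTITION BY RANGE (TO_DAYS(ts)) (
  PARTITION p20140101 VALUES LESS THAN (TO_DAYS('2014-01-02')),
  PARTITION p20140102 VALUES LESS THAN (TO_DAYS('2014-01-03'))
);

LOAD DATA INFILE '/data/raw/events_0001.csv'
INTO TABLE events
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
(id, ts, payload);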
  [Read more...]
Converting an OLAP database to TokuDB, part 1

This is the first in a series of posts describing my impressions of converting a large OLAP server to TokuDB. There's a lot to tell, and the experiment is not yet complete, so this will be ongoing blogging. In this post I will describe the case at hand and our initial reasons for looking at TokuDB.

Disclosure: I have no personal interests and no company interests; we did get friendly, useful and free advice from Tokutek engineers. TokuDB is open source and free to use, though a commercial license is also available.

The case at hand

We have a large and fast growing DWH MySQL setup. This data warehouse is but one component in a larger data setup, which includes Hadoop, Cassandra and more. For online dashboards and most reports, MySQL is our service. We populate this warehouse mainly via Hive/Hadoop. Thus, we have an hourly load of data from Hive, as

  [Read more...]
How TokuMX Gets Great Compression for MongoDB

In my last post, I showed what a Fractal Tree® index is at a high level. Once again, the Fractal Tree index is the data structure inside TokuMX and TokuDB, our MongoDB and MySQL products. One of its strengths is the ability to get high levels of compression on the stored data. In this post, I’ll explain why that is.

At a high level, one can argue that there isn’t anything special about our compression algorithms. We basically do this: we take large chunks of data, use known compression methods (e.g. zlib,

  [Read more...]
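The excerpt is cut off, but the core claim — compressing large chunks yields better ratios than compressing many small ones — is easy to verify yourself. A quick shell sketch (the input file name is a placeholder; any large text file will do):

# One large block:
head -c 16M big_input.txt > sample.dat
gzip -c sample.dat | wc -c
# The same bytes compressed as independent 16K blocks:
split -b 16k sample.dat chunk_
for f in chunk_*; do gzip -c "$f"; done | wc -c
# The second total is typically noticeably larger.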
Announcing TokuMX v1.0: Toku+Mongo = You Can Have It All

Tokutek is known for its full-featured fast-indexing technology. MongoDB is known for its great document-based data model and ease of use. TokuMX, version 1.0, combines the best of both worlds.

  • So what, exactly, is TokuMX? The simplest (but incomplete) answer is that TokuMX is MongoDB with all its storage code replaced by Tokutek’s Fractal Tree indexes.
  • How do Fractal Tree indexes improve MongoDB? The direct benefits include high-performance indexing, strong compression, and performance stability – in other words, the performance stays high, even when data is larger than RAM.
  • Are there any features in TokuMX that MongoDB doesn’t have? Yes. We have added support for transactions to TokuMX, so that TokuMX is ACID compliant and has MVCC. We have also added support for clustering indexes, which
  [Read more...]
Understanding Tokutek Fractal Tree Indexes

Download PDF Presentation

Thanks to Tim Callaghan for speaking Tuesday night at the Effective MySQL New York meetup on Fractal Tree Indexes: Theory and Practice (MySQL and MongoDB). There was a good turnout and a full room to learn how the TokuDB storage engine from Tokutek is changing how to handle big data in MySQL.

Also interesting is how the same technology has been applied for use in MongoDB, including giving MongoDB transactions, a big change for NoSQL.

Related News: Tokutek Meets Big Data Demand With Open Source TokuDB

Sysbench Benchmark for MongoDB

As we continue to test our Fractal Tree Indexing with MongoDB, I’ve been updating my benchmark infrastructure so I can compare performance, correctness, and resource utilization.  Sysbench has long been a standard for testing MySQL performance, so I created a version that is compatible with MongoDB.  You can grab my current version of Sysbench for MongoDB here.

So what exactly is Sysbench?  According to the Sysbench homepage, “Sysbench is a modular, cross-platform and multi-threaded benchmark tool for evaluating OS [Operating System] parameters that are important for a system running a database under intensive load.”

  • Sysbench schema
    • 16 copies of the same collection,

  [Read more...]
Tracking 5.3 Billion Mutations: Using MySQL for Genomic Big Data

University of Montreal Tracks Genomic Data With Tokutek’s TokuDB.

Faster insertion rates, improved scalability and agility support lab’s fast growing research database as it grows from 100s of GBs to 1 TB and beyond.

Issue addressed: MySQL database used for genomic research must be able to quickly ingest huge amounts of incoming data – hundreds of thousands of records every day. It also must be able to retrieve data quickly in response to a diverse set of research requests.

Enabling the Hunt for New Cures for Diseases by Seamlessly Processing Billions of Mutations  [Read more...]

Move over Marcia: Top Ten for 2012

Well, it's that time of the year again for top ten lists. There have been many versions showing up on the web the last few days, including Time Magazine's “Top 10 Everything of 2012” list, with 55 wide-ranging lists!

Last year we started using Google Analytics to see what content for blogs was most popular on Tokutek.com and generated a 2011 top ten list, ending up with a few surprises.  This year saw spikes in some interesting areas as well, including flash performance, NASA and Big Data, and MongoDB.

Without further ado, here is the top ten list for 2012:

10. Announcing TokuDB v6.1

  [Read more...]
Webinar: Introduction to TokuDB v6.5

TokuDB® is a proven solution that scales MySQL® and MariaDB® from GBs to TBs with unmatched insert and query speed, compression, replication performance and online schema flexibility. Tokutek’s recently launched TokuDB v6.5 delivers all of these features and more, not just for HDDs, but also for flash memory.

Originally Aired: October 10th
AVAILABLE ON DEMAND

TokuDB v6.5:

  • Stores 10x More Data – TokuDB delivers 10x compression without any performance degradation. Users can therefore take advantage of much greater amounts of available space without paying more for additional storage.
  • Delivers High Insertion Speed – TokuDB

  [Read more...]
Helping to Reduce the Page Compression Failure Rate

When InnoDB compresses a page it needs the result to fit into its predetermined compressed page size (specified with KEY_BLOCK_SIZE). When the result does not fit we call that a compression failure, and InnoDB needs to split up the page and try to compress again. Needless to say, compression failures are bad for performance and should be minimized.

Whether the result of the compression will fit largely depends on the data being compressed: some tables and/or indexes may contain more compressible data than others. So it would be nice if the compression failure rate, along with other compression stats, could be monitored on a per-table or even a per-index basis, wouldn't it?

This is where the new INFORMATION_SCHEMA table in MySQL 5.6 kicks in. INFORMATION_SCHEMA.INNODB_CMP_PER_INDEX provides exactly this helpful information. It contains the



  [Read more...]
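The excerpt ends before listing the table's columns, but a typical failure-rate query against it might look like this (a sketch; note the table is only populated while innodb_cmp_per_index_enabled is ON):

SET GLOBAL innodb_cmp_per_index_enabled = ON;

SELECT database_name, table_name, index_name,
       compress_ops, compress_ops_ok,
       100 * (compress_ops - compress_ops_ok) / compress_ops AS failure_pct
FROM INFORMATION_SCHEMA.INNODB_CMP_PER_INDEX
WHERE compress_ops > 0
ORDER BY failure_pct DESC;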
What compression do you use?

The following is an evaluation of various compression utilities that I tested when reviewing options for MySQL backup strategies. The overall winner in performance was pigz, a parallel implementation of gzip. If you use gzip today, as most organizations do, this one change will improve your backup compression times.
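Swapping pigz in for gzip is usually a one-line change; a minimal sketch (file names and the thread count are illustrative; pigz uses all available cores by default):

# Compress a mysqldump stream with 8 parallel threads:
mysqldump --all-databases | pigz -p 8 > backup.sql.gz
# Decompress (unpigz ships with pigz):
unpigz < backup.sql.gz | mysql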

Details of the test:

  • The database is 5.4GB of data
  • mysqldump produces a backup file of 2.9GB
  • The server is an AWS t1.xlarge with a dedicated EBS volume for backups

The following testing was performed to compare the time and % compression savings of various available open source products. This was not an exhaustive test with multiple iterations and different types of data files.

Compression Utility | Compression Time (sec) | Decompression Time (sec) | New Size (% Saving)
lzo                 | …                      | …                        | …
  [Read more...]
TokuDB v6.0: Even Better Compression

A key feature of our new TokuDB v6.0 release, which I have been blogging about this week, is compression. Compression is always on in TokuDB, and the compression we've achieved in the past has been quite good. See a previous post on the 18x compression achieved by TokuDB v5.0 on one benchmark. In our latest release, we've updated the way compression works and achieved a 50% improvement in compression.

I decided to present numbers on the same set of data as the old post, so see that post for experimental details.

But first, what are the changes? TokuDB compresses large blocks

  [Read more...]
Announcing TokuDB v6.0: Less Slave Lag and More Compression

We are excited to announce TokuDB® v6.0, the latest version of Tokutek’s flagship storage engine for MySQL and MariaDB.

This version offers feature and performance enhancements over previous releases, support for XA (two-phase transactional commits), better compression, and reduced performance variability associated with checkpointing. This release also brings TokuDB support up to date on MySQL v5.1, MySQL v5.5 and MariaDB v5.2. There’s a lot of great technical stuff under the hood in this release and I’ll be reviewing the improvements one-by-one over the course of this week.

I’ll be posting more details about the new features and performance, so here’s an overview of what’s in store.

Replication Slave Lag

One of the things TokuDB does well is single-threaded insertions, which translates directly into  [Read more...]
On InnoDB compression in production

Our latest changes have been pushed to the public mysql@facebook branch, allowing this post to happen \o/

Recently we started rolling out InnoDB compression to our main database tier, which has been a huge undertaking for multiple teams and a major test for MySQL. Nizam was surely the hero of all this work; make sure you don't miss his talk about it at the MySQL conference.

Though the MySQL manual has quite an introduction to the benefits of compression, we agree that the benefits are good – in theory we can do fewer reads from disk, keep more data in the buffer pool, or

  [Read more...]
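For readers following along, turning compression on for a table in that era's MySQL looked roughly like this (a minimal sketch; table and column names are illustrative):

-- Barracuda + file-per-table are prerequisites for compressed tables:
SET GLOBAL innodb_file_per_table = ON;
SET GLOBAL innodb_file_format = Barracuda;

CREATE TABLE messages (
  id BIGINT NOT NULL PRIMARY KEY,
  body TEXT
) ENGINE=InnoDB
  ROW_FORMAT=COMPRESSED
  KEY_BLOCK_SIZE=8;  -- compressed page size in KB; uncompressed pages are 16K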
Evidenzia Upgrades to TokuDB v5.2 to Address Storage Growth and Scale Performance

Ensuring sufficient disk I/O to catch copyright violations at network speed.

Evidenzia GmbH & Co. KG

Issues addressed:

  • Storage growth, including maxed-out disk I/O utilization
  • Performance issues and business impact due to slow selects
  • Inability to revise data schema on the fly

The Company: Evidenzia GmbH & Co. KG is one of the leading partners of the software, movie and music industry when it comes to tracing copyright infringements

  [Read more...]
Announcing TokuDB v5.2: Improved Multi-Client Scaling and Faster Queries

TokuDB® v5.2, the latest version of Tokutek’s flagship storage engine for MySQL and MariaDB, is now available.

This version offers performance enhancements over previous releases, especially for multi-client scale up and point queries, and extends the cases where ALTER TABLE is non-blocking, in particular adding Hot Column Rename.

TokuDB v5.2 maintains all our established advantages: fast trickle load, fast bulk load, fast range queries through clustering indexes, hot schema changes, great compression, no fragmentation, and full MySQL compatibility for ease of installation. See our benchmark page for details.

Multi-client workloads

In TokuDB v5.2, we have reworked our locking scheme to better support multi-client workloads, and as

  [Read more...]
Are You Forcing MySQL to Do Twice as Many JOINs as Necessary?
Baron Schwartz

This guest post is from our friends at Percona. They're hosting Percona Live London from October 24-25, 2011. Percona Live is a two-day summit with 100% technical sessions led by some of the most established speakers in the MySQL field.

In the London area and interested in attending? We are giving away two free passes in the next few days. Watch our @tokutek twitter feed for a chance to win.

Did you know that the following query actually performs a JOIN? You can’t see it, but it’s there:

SELECT the_day, COUNT(*), SUM(clicks),

  [Read more...]
Compression Benchmarking: Size vs. Speed (I want both)

I’m creating a library of benchmarks and test suites that will run as part of a Continuous Integration (CI) process here at Tokutek. My goal is to regularly measure several aspects of our storage engine over time: performance, correctness, memory/CPU/disk utilization, etc. I’ll also be running tests against InnoDB and other databases for comparative analysis. I plan on posting a series of blog entries as my CI framework evolves, for now I have the results of my first benchmark.

Compression is an always-on feature of TokuDB. There are no server/session variables to enable compression or change the compression level (one goal of TokuDB is to have as few tuning parameters as possible). My compression benchmark uses

  [Read more...]
Failure to compress

My team has been evaluating the performance of InnoDB compression. One of the workloads is to reload all tables of the database using concurrent connections. This is done with production data and via the insert benchmark. We also modified InnoDB to use qpress/quicklz as an alternative to zlib compression.

 

As I wrote this I realized that it is a great advertisement for a self-tuning DBMS. Alas, MySQL and InnoDB are not self-tuning when you have extreme performance requirements. Eventually they will move in that direction, and many users today don't require this amount of tuning anyway. Some of the parameters listed below are only in patches for MySQL from Facebook and others. Official MySQL tends to be better about making

  [Read more...]
When does InnoDB compress and decompress pages?

There are two sections for rows in the page format for InnoDB compressed tables. The compressed section has one or more rows and must be decompressed to access individual rows. The modification log has uncompressed rows and rows can be accessed without decompressing. The modification log is used to avoid decompressing and then possibly recompressing the compressed section on every row change. The buffer pool also has separate uncompressed copies of some pages so that every row read does not require a page decompression.

 

I want to understand when a page must be decompressed or recompressed. This is definitely an incomplete list.

  • A page is decompressed when a row is read and the uncompressed version of the page is not in the buffer pool.
  • I think a row can be
  [Read more...]
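To see how often these (de)compression operations actually occur on a running server, the INFORMATION_SCHEMA.INNODB_CMP table exposes global counters per compressed page size; a quick look:

SELECT page_size, compress_ops, compress_ops_ok,
       compress_time, uncompress_ops, uncompress_time
FROM INFORMATION_SCHEMA.INNODB_CMP;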
How does InnoDB manage the LRU for compressed pages?

InnoDB uses at least two lists to manage pages in the buffer pool. The LRU is used for pages from compressed and uncompressed tables. If a table is uncompressed then it only uses the LRU. If a table is compressed then compressed pages are on the LRU, and the unzip_LRU is used for pages that have both an uncompressed and a compressed version in the buffer pool. When a server is IO-bound, InnoDB allows the unzip_LRU to be 10% of the size of the LRU. SHOW INNODB STATUS displays the lengths of the LRU and unzip_LRU (grep it for "unzip_LRU"). Note that I use the word page in some places when frame might be

  [Read more...]
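Following the post's own grep suggestion, something like this shows the two list lengths (assuming a local server and the newer SHOW ENGINE syntax):

mysql -e 'SHOW ENGINE INNODB STATUS\G' | grep -i 'unzip_LRU'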
A few notes on InnoDB PRIMARY KEY
InnoDB uses an index-organized data storage technique, wherein the primary key acts as the clustered index, and this clustered index holds the data. It's for this reason that understanding the basics of the InnoDB primary key is very important, and hence the need for these notes.
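A short illustration of the point (the table is hypothetical): the primary key dictates the physical order of rows, and every secondary index carries the primary key as its row pointer, so a fat primary key inflates every index.

CREATE TABLE orders (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  customer_id INT NOT NULL,
  created_at DATETIME NOT NULL,
  PRIMARY KEY (id),              -- clustered: the rows live in this B-tree
  KEY idx_customer (customer_id) -- secondary: stores customer_id plus id
) ENGINE=InnoDB;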
MySQL 5.5: Improved manageability, efficiency for InnoDB
In my continuing blog series on MySQL 5.5 features (see the performance/scale and replication entries), today I am covering some of the new InnoDB manageability and efficiency options. 5.5, with the newly re-architected InnoDB, provides better user control over internal InnoDB settings, so things like performance, scale and storage can easily be monitored, tuned and optimized for specific use cases and application loads.

Along these lines, some of the key advances and features available in MySQL 5.5 and InnoDB are:
 
  • Faster Index Creation - MySQL 5.5 can now add or drop indexes without copying the underlying data of the entire target table.




  [Read more...]
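As a minimal illustration of that bullet (table and index names hypothetical): in 5.5, statements like these build or drop just the secondary index rather than rebuilding the whole table.

ALTER TABLE orders ADD INDEX idx_created (created_at);
ALTER TABLE orders DROP INDEX idx_created;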
ZFS and Solaris: storage optimization for the cloud

Cloud computing has been one of the most discussed topics of the year, and the discussion is not over, because what is really being discussed is the way we will access computing and storage resources in the future. Even famous French intellectuals are giving their opinions and making predictions. The future will decide on the predictions' accuracy.

What is usually less discussed is the technology behind cloud computing, though it is no secret that virtualization is playing a key role. Cloud data centers will be loaded with virtual machines, each potentially requiring as much disk space as a complete operating system (OS), which can run to many gigabytes. How much disk space does a virtual machine image (vdi) really consume? The only good answer is: too much.

  [Read more...]
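The excerpt's answer — "too much" — is what ZFS features like compression and clones address. A hedged sketch of the approach (pool and dataset names are invented):

zfs create rpool/vdi
zfs set compression=gzip rpool/vdi               # children inherit the setting
zfs create rpool/vdi/golden                      # master VM image lives here
zfs snapshot rpool/vdi/golden@base
zfs clone rpool/vdi/golden@base rpool/vdi/vm01   # vm01 stores only its deltas
zfs get compressratio rpool/vdi                  # observe the actual savings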
