InnoDB performance improvements

The problem
After making several performance fixes, notably the kernel mutex split and the new handling of read-only transactions (in particular non-locking auto-commit read-only transactions), we weren’t seeing any increase in transactions per second (TPS) on our high-end hardware. On one particular host, with 24 cores and 2 threads per core, the TPS using Sysbench was a tepid 5.6K at 16 threads and more or less plateaued up to 1K user threads. No matter what config settings we used, we ended up with more or less the same result.

We ended up getting together for a meeting in Paris to discuss this issue, and during the brainstorming one of the potential issues that cropped up was the effect of cache coherence and/or false sharing. Using the excellent Linux tool perf, we were able to narrow it down to a global statistic counter in row_sel_search_for_mysql(). Mikael Ronstrom explains this in more detail.

The solution
We created a generic counter class (InnoDB code is now C++) that splits the counter into multiple (configurable) slots that sit on separate 64-byte cache lines. The thread id of the updating thread is used to index into a slot, which reduces the contention/sharing, and it had the desired effect. The TPS went from 5.6K to 15K at 64 user threads and stayed close to stable right up to 1K, with very slow degradation. This was using Sysbench OLTP_RO for autocommit non-locking read-only queries (Sysbench option --oltp-skip-trx=off).
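To make the idea concrete, here is a minimal sketch of a counter split across cache-line-sized slots. It is not the actual ib_counter_t code: the names ShardedCounter, kCacheLineSize and slot_index() are made up for illustration, the 64-byte line size is taken from the description above, and the use of relaxed atomics and a hash of the thread id is just one possible choice for a standalone example.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical sketch, not InnoDB's ib_counter_t.
// Assumed cache line size, as stated in the text above.
constexpr std::size_t kCacheLineSize = 64;

// N is the (configurable) number of slots.
template <std::size_t N = 64>
class ShardedCounter {
public:
    // Add n to the slot chosen for the calling thread.
    void add(std::uint64_t n = 1) {
        slots_[slot_index()].value.fetch_add(n, std::memory_order_relaxed);
    }

    // Sum all slots. Readers pay the cost of touching every cache line,
    // which is acceptable for statistics that are read rarely.
    std::uint64_t sum() const {
        std::uint64_t total = 0;
        for (const Slot& s : slots_) {
            total += s.value.load(std::memory_order_relaxed);
        }
        return total;
    }

private:
    // alignas pads each slot to a full cache line, so two slots
    // never share a line and updates do not invalidate each other.
    struct alignas(kCacheLineSize) Slot {
        std::atomic<std::uint64_t> value{0};
    };

    // Map the calling thread to a slot; computed once per thread.
    // Two threads can still land on the same slot, which only
    // costs some contention, not correctness.
    static std::size_t slot_index() {
        static thread_local const std::size_t idx =
            std::hash<std::thread::id>{}(std::this_thread::get_id()) % N;
        return idx;
    }

    Slot slots_[N];
};
```

A hot path would call add() on a shared instance, and the code that reports statistics would call sum(). The trade-off is memory (N cache lines per counter) and a slower read in exchange for updates that mostly stay local to one core's cache.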

The code and binary can be downloaded from labs release downloads; the current release is mysql-5.6.6-labs-april-2012-*. See the code in include/os0thread.h. The new class is ib_counter_t.

We have now refactored the code and grouped all the InnoDB statistic counters in srv_counter_t. This will help in further consolidation and improvements. Currently, most of the InnoDB config and statistics variables are defined in (with a few exceptions). We need to start paying even more attention to their layout and alignment from now on. There seem to be some false sharing issues that we haven’t completely identified yet.

For results that reflect the improvements, I think it is better to look at Dimitri’s blog.