MySQL 5.6 vs MySQL 5.5 and the Star Schema Benchmark

MySQL 5.6 vs MySQL 5.5 & the Star Schema Benchmark

So far most of the benchmarks posted about MySQL 5.6 use the sysbench OLTP workload.  I wanted to test a set of queries which, unlike sysbench, utilize joins.  I also wanted an easily reproducible set of data which is more rich than the simple sysbench table.  The Star Schema Benchmark (SSB) seems ideal for this.

I wasn’t going to focus on the performance of individual queries in this post, but instead intended to focus only on the overall response time for answering all of the queries in the benchmark. I got some strange results, however, which showed MySQL 5.6.10 to be much slower than MySQL 5.5.30 even with only a single connection. I felt these results warranted deeper investigation, so I did some research and detailed my findings here.

Just a few notes:
I tested two scenarios: a buffer pool much smaller than the data set (default size of 128MB, which is 1/8th of the data) and I also testing a 4G buffer pool, which is larger than the data. Very little tuning was done. The goal was to see how MySQL 5.6 performs out-of-the-box as compared to 5.5.30 with default settings. The non-default settings were tried to dig deeper into performance differences and are documented in the post.

This blog post is not a definitive conclusion about innodb_old_blocks_pct or innodb_old_blocks_time. It does highlight how a data set much larger than the buffer pool may perform worse with innodb_old_blocks_time=1000, but as I said this needs further investigation. One particular point of investigation which needs to be followed up on, including testing innodb_old_blocks_time=1000 on MySQL 5.5.30 and testing multiple buffer pools on MySQL 5.5.30. Finally, MySQL 5.6.10 has many additional tuning options which must be investigated (MRR, BKA, ICP, etc) before coming to further conclusions. These will be the topic of further blog posts.

Benchmark Details:

The SSB employs a data generator which produces data for a star schema.  Star schema are commonly used for analytics because it is extremely easy to construct queries against.  It is also very easy to define an OLAP cube over a star schema, so they are popular for use with tools like Mondrian and also for data mining.  I wrote an earlier blog post which describes the differences between major schema types.

  • I used the SSB data set at scale factor 1.  Scale factor 1 results in 587MB of raw data, mostly in one  table (lineorder).
  • Each of the 13 queries were executed serially in a single connection
  • I modified the queries to use ANSI JOIN syntax.  No other changes to the queries were made.

Test Environment

  • The MySQL versions used at the time of this post are 5.5.30 and 5.6.10, each of which are GA when this was written.
    • I compiled both servers from source (cmake -gui .; make; make install)
    • Only changes from defaults was that both servers are compiled without the PERFORMANCE_SCHEMA, and paths are unique for basedir and datadir
  • I tested three configurations:
    • Config 1: Default config for MySQL 5.5 and MySQL 5.6, no tuning at all
    • Config 2: MySQL 5.6 with all default settings except innodb_old_blocks_time=0
    • Config 3: MySQL 5.5 and 5.6 with a 4G buffer pool instead of the default 128M

Rationale:

  • Since O_DIRECT is not used by default, the file system cache will give better read performance after first run (but not as good as warm buffer pool)
  • Thus, the results marked COLD are the results after the server reboot, when the FS cache is cold
  • The remaining results are runs without a server restart.  For the default size BP, this means the FS cache is warm.  For the 4G BP, the BP is completely warm.
    • The idea here is to test the situation when the buffer pool is smaller than data and the IO is slow (when the  FS cache is cold, IO to slow IO subsystem happens)
    • Repeated runs test a buffer pool which is smaller than the data but underlying IO is fast (a warm FS cache reduces IO cost significantly)
    • And finally, testing with a 4G buffer pool shows how the system performs when the data fits completely into the buffer pool (no IO on repeat runs)

Test Server:

    • Intel core i970-3.20GHz.  12 logical cores (six physical cores).
    • 12GB memory
    • 4 disk 7200RPM RAID 10 array with 512MB write-back cache

 Star Schema Benchmark – Scale Factor 1 – Mysql 5.5 vs 5.6
response times are in seconds (lower is better)

Version Buffer Cold Run1 Run2 Run3
5.5.30 128M 361.49 189.29 189.34 189.40
5.6.10 128M 362.31 324.25 320.74 318.84
5.6.10 (innodb_old_blocks_time=0) 128M 349.24 178.80 178.55 179.07
5.5.30 4G 200.87 20.53 20.36 20.35
5.6.10 4G 195.33 14.41 14.45 14.61

I started by running the benchmark against MySQL 5.5.30.  It took 361.49 seconds to complete all 13 queries.  I then repeated the run three more times.  The speed is very consistent, just a few tenths of a second off per run.  I then rebooted the machine and fired up 5.6.10.   I ran the test, and to my surprise MySQL 5.6.10 did not get much faster during the repeat runs, compared to the initial cold run.  I stopped the MySQL 5.6 server, rebooted and verified again.  Same issue.  This was very different from MySQL 5.5.30, which performs significantly better on the repeat warm runs.

Just to be sure it wasn’t a disk problem, I pointed the MySQL 5.6.10 at the MySQL 5.5.30 data directory.  Tthe speed was essentially the same.   I did some further investigation and I determined that there was a lower buffer pool hit ratio during the MySQL 5.6 runs and MySQL 5.6.10 was doing more IO as a consequence.  To confirm that this was indeed the problem I decided to compare performance with a buffer pool much larger than the data size, so I configured the server with a 4GB buffer pool.  I tested both versions, and as you can see above, MySQL 5.6 outperformed MySQL 5.5.30 with the big buffer pool.

Why is the MySQL 5.6.10 with default settings test significantly slower than MySQL 5.5.30 in repeat runs?

I thought about the differences in the defaults between MySQL 5.5 and MySQL 5.6 and innodb_old_blocks_time immediately came to mind.  The InnoDB plugin introduced innodb_old_blocks_time to help control the behavior of the new split LRU mechanism which was implemented in the plugin.  In the original InnoDB, the LRU was implemented as a classic LRU which is subject to “pollution” by full table scans.  In the classic LRU, a full table scan pushes out important hot pages from the buffer pool often for an infrequent scan, like a backup or report.  In an OLTP system this can have very negative performance consequences.

The plugin attempts to fix this problem by splitting the LRU into hot and cold sections.  When a page is first read into the buffer pool it is first placed onto the head of the cold section of the LRU, where it begins to age of naturally.  If the page is touched again while on the cold portion, it is moved to the head of the hot portion.

This sounds good in theory, but in practice it is problematic.  What usually happens is that the full table scans access the table by primary key.  This forces the storage engine to touch the same page numerous times in rapid succession.  This invariably moves the page onto the hot area, defeating the split.  In order to prevent this from happening, another variable innodb_old_blocks_time was introduced.

Innodb_old_blocks_time controls how long a page must be on the cold portion of the LRU before it is eligible to be moved to the hot portion.  In MySQL 5.5 and earlier, innodb_old_blocks_time defaults to a value of 0(zero), which means that pages move rapidly from the cold portion to the hot portion because they must stay on the cold LRU for zero milliseconds before being able to move to the hot list.  In MySQL 5.6 the default value of innodb_old_blocks_time is changed to 1000.   The location at which a page is initially placed into the LRU is defined by innodb_old_blocks_pct.  The default value on both versions is 38, which happens to be 3/8 of the buffer pool.

For this workload with a small buffer pool (the buffer pool is smaller than the working set) having innodb_old_blocks_time=1000 appears to cause a major performance regression.  The new setting  changes which pages end up staying in the buffer pool, and which are aged out.

Digging into why innodb_old_blocks_time change the performance?Each “flight” of queries represents a set of drill-down queries to find an anomaly.  I am going to focus on the first query, which uses only one join. Since it is practical for a query with only one join, I’ve tested performance of the query with the join in both directions.Explain for query Q1.1:

mysql> select straight_join sum(lo_extendedprice*lo_discount) as revenue
from lineorder join  dim_date on lo_orderdatekey = d_datekey
where d_year = 1993 and lo_discount between 1 and 3 and lo_quantity < 25;
+--------------+
| revenue      |
+--------------+
| 446268068091 |
+--------------+
1 row in set (33.94 sec)
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: lineorder
         type: ALL
possible_keys: LO_OrderDateKey
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5996539
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: dim_date
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: ssb.lineorder.LO_OrderDateKey
         rows: 1
        Extra: Using where
2 rows in set (0.00 sec)

After running the query, see how many pages were read from disk versus how many page requests their were: 

mysql> select * from information_schema.global_status where variable_name like '%innodb%read%';
+---------------------------------------+----------------+
| VARIABLE_NAME                         | VARIABLE_VALUE |
+---------------------------------------+----------------+
| INNODB_BUFFER_POOL_READ_AHEAD_RND     | 0              |
| INNODB_BUFFER_POOL_READ_AHEAD         | 38392          |
| INNODB_BUFFER_POOL_READ_AHEAD_EVICTED | 0              |
| INNODB_BUFFER_POOL_READ_REQUESTS      | 6731100        |
| INNODB_BUFFER_POOL_READS              | 570            |
| INNODB_DATA_PENDING_READS             | 0              |
| INNODB_DATA_READ                      | 640536576      |
| INNODB_DATA_READS                     | 38972          |
| INNODB_PAGES_READ                     | 38961          |
| INNODB_ROWS_READ                      | 6611119        |
+---------------------------------------+----------------+
10 rows in set (0.00 sec)
Here are the contents of the buffer pool in pages afterwards:
mysql> select sq.*, pages / (@@innodb_buffer_pool_size / 16384) * 100 pct_buffer_pool
from (
select table_name, index_name, count(*) pages, sum(is_old='YES') old,
count(*) - sum(is_old='YES') hot, sum(number_records) records
from information_schema.innodb_buffer_page_lru
where  table_name like '%ssb%' group by 1,2
) sq
order by pct_buffer_pool desc;
+-------------------+------------------+-------+------+------+---------+-----------------+
| table_name        | index_name       | pages | old  | hot  | records | pct_buffer_pool |
+-------------------+------------------+-------+------+------+---------+-----------------+
| `ssb`.`lineorder` | GEN_CLUST_INDEX  |  6909 | 2559 | 4350 | 1083172 |         84.3384 |
| `ssb`.`lineorder` | LO_PartKey       |    17 |    0 |   17 |    9979 |          0.2075 |
| `ssb`.`lineorder` | LO_CommitDateKey |    17 |    0 |   17 |   10776 |          0.2075 |
| `ssb`.`lineorder` | LO_OrderDateKey  |    17 |    0 |   17 |   10376 |          0.2075 |
| `ssb`.`dim_date`  | PRIMARY          |    17 |    0 |   17 |    2481 |          0.2075 |
| `ssb`.`lineorder` | LO_CustKey       |    16 |    0 |   16 |    8616 |          0.1953 |
| `ssb`.`lineorder` | LO_OrderKey      |    16 |    0 |   16 |   10943 |          0.1953 |
| `ssb`.`lineorder` | LO_SuppKey       |    15 |    0 |   15 |   11466 |          0.1831 |
+-------------------+------------------+-------+------+------+---------+-----------------+
8 rows in set (0.12 sec)
And the Innodb stats:
mysql> select * from information_schema.innodb_buffer_pool_stats\G
*************************** 1. row ***************************
                         POOL_ID: 0
                       POOL_SIZE: 8191
                    FREE_BUFFERS: 1024
                  DATABASE_PAGES: 7162
              OLD_DATABASE_PAGES: 2623
         MODIFIED_DATABASE_PAGES: 0
              PENDING_DECOMPRESS: 0
                   PENDING_READS: 0
               PENDING_FLUSH_LRU: 0
              PENDING_FLUSH_LIST: 0
                PAGES_MADE_YOUNG: 3
            PAGES_NOT_MADE_YOUNG: 4824154
           PAGES_MADE_YOUNG_RATE: 0
       PAGES_MADE_NOT_YOUNG_RATE: 0
               NUMBER_PAGES_READ: 38960
            NUMBER_PAGES_CREATED: 0
            NUMBER_PAGES_WRITTEN: 1
                 PAGES_READ_RATE: 0
               PAGES_CREATE_RATE: 0
              PAGES_WRITTEN_RATE: 0
                NUMBER_PAGES_GET: 6731253
                        HIT_RATE: 0
    YOUNG_MAKE_PER_THOUSAND_GETS: 0
NOT_YOUNG_MAKE_PER_THOUSAND_GETS: 0
         NUMBER_PAGES_READ_AHEAD: 38457
       NUMBER_READ_AHEAD_EVICTED: 0
                 READ_AHEAD_RATE: 0
         READ_AHEAD_EVICTED_RATE: 0
                    LRU_IO_TOTAL: 431
                  LRU_IO_CURRENT: 0
                UNCOMPRESS_TOTAL: 0
              UNCOMPRESS_CURRENT: 0
1 row in set (0.00 sec)

Now compare the difference with innodb_old_blocks_time=0;

mysql> set global innodb_old_blocks_time=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select straight_join sum(lo_extendedprice*lo_discount) as revenue
from lineorder join  dim_date on lo_orderdatekey = d_datekey
where d_year = 1993 and lo_discount between 1 and 3 and lo_quantity < 25\G
*************************** 1. row ***************************
revenue: 446268068091
1 row in set (7.81 sec)
mysql> select * from information_schema.global_status where variable_name like '%innodb%read%';
+---------------------------------------+----------------+
| VARIABLE_NAME                         | VARIABLE_VALUE |
+---------------------------------------+----------------+
| INNODB_BUFFER_POOL_READ_AHEAD_RND     | 0              |
| INNODB_BUFFER_POOL_READ_AHEAD         | 38461          |
| INNODB_BUFFER_POOL_READ_AHEAD_EVICTED | 0              |
| INNODB_BUFFER_POOL_READ_REQUESTS      | 6731687        |
| INNODB_BUFFER_POOL_READS              | 550            |
| INNODB_DATA_PENDING_READS             | 0              |
| INNODB_DATA_READ                      | 641339392      |
| INNODB_DATA_READS                     | 39021          |
| INNODB_PAGES_READ                     | 39010          |
| INNODB_ROWS_READ                      | 6611119        |
+---------------------------------------+----------------+
10 rows in set (0.00 sec)
mysql> select sq.*, pages / (@@innodb_buffer_pool_size / 16384) * 100 pct_buffer_pool
from (
select table_name, index_name, count(*) pages, sum(is_old='YES') old,
count(*) - sum(is_old='YES') hot, sum(number_records) records
from information_schema.innodb_buffer_page_lru
where  table_name like '%ssb%' group by 1,2
) sq
order by pct_buffer_pool desc;
+-------------------+-----------------+-------+------+------+---------+-----------------+
| table_name        | index_name      | pages | old  | hot  | records | pct_buffer_pool |
+-------------------+-----------------+-------+------+------+---------+-----------------+
| `ssb`.`lineorder` | GEN_CLUST_INDEX |  7085 | 2547 | 4538 | 1104291 |         86.4868 |
| `ssb`.`dim_date`  | PRIMARY         |    17 |   17 |    0 |    2481 |          0.2075 |
+-------------------+-----------------+-------+------+------+---------+-----------------+
2 rows in set (0.11 sec)
So there is more of lineorder in the buffer pool and the other secondary indexes have been pushed out of the buffer pool.
mysql> select * from information_schema.innodb_buffer_pool_stats\G
*************************** 1. row ***************************
                         POOL_ID: 0
                       POOL_SIZE: 8192
                    FREE_BUFFERS: 1024
                  DATABASE_PAGES: 7163
              OLD_DATABASE_PAGES: 2624
         MODIFIED_DATABASE_PAGES: 0
              PENDING_DECOMPRESS: 0
                   PENDING_READS: 0
               PENDING_FLUSH_LRU: 0
              PENDING_FLUSH_LIST: 0
                PAGES_MADE_YOUNG: 29501
            PAGES_NOT_MADE_YOUNG: 0
           PAGES_MADE_YOUNG_RATE: 951.6144640495468
       PAGES_MADE_NOT_YOUNG_RATE: 0
               NUMBER_PAGES_READ: 39009
            NUMBER_PAGES_CREATED: 0
            NUMBER_PAGES_WRITTEN: 1
                 PAGES_READ_RATE: 1249.8306506241734
               PAGES_CREATE_RATE: 0
              PAGES_WRITTEN_RATE: 0.032257023966968806
                NUMBER_PAGES_GET: 6731790
                        HIT_RATE: 995
    YOUNG_MAKE_PER_THOUSAND_GETS: 4
NOT_YOUNG_MAKE_PER_THOUSAND_GETS: 0
         NUMBER_PAGES_READ_AHEAD: 38459
       NUMBER_READ_AHEAD_EVICTED: 0
                 READ_AHEAD_RATE: 1240.5728847456533
         READ_AHEAD_EVICTED_RATE: 0
                    LRU_IO_TOTAL: 531
                  LRU_IO_CURRENT: 0
                UNCOMPRESS_TOTAL: 0
              UNCOMPRESS_CURRENT: 0
1 row in set (0.01 sec)

Here is the difference between innodb_buffer_pool_stats side by side:

INNODB_OLD_BLOCKS_TIME=0                     INNODB_OLD_BLOCKS_TIME=1000
*************************** 1. row ************************************************ 1. row *******
                         POOL_ID: 0                    *                         POOL_ID: 0
                       POOL_SIZE: 8192                 *                       POOL_SIZE: 8191
                    FREE_BUFFERS: 1024                 *                    FREE_BUFFERS: 1024
                  DATABASE_PAGES: 7163                 *                  DATABASE_PAGES: 7162
              OLD_DATABASE_PAGES: 2624                 *              OLD_DATABASE_PAGES: 2623
         MODIFIED_DATABASE_PAGES: 0                    *         MODIFIED_DATABASE_PAGES: 0
              PENDING_DECOMPRESS: 0                    *              PENDING_DECOMPRESS: 0
                   PENDING_READS: 0                    *                   PENDING_READS: 0
               PENDING_FLUSH_LRU: 0                    *               PENDING_FLUSH_LRU: 0
              PENDING_FLUSH_LIST: 0                    *              PENDING_FLUSH_LIST: 0
                PAGES_MADE_YOUNG: 29501                *                PAGES_MADE_YOUNG: 3
            PAGES_NOT_MADE_YOUNG: 0                    *            PAGES_NOT_MADE_YOUNG: 4824154
           PAGES_MADE_YOUNG_RATE: 951.6144640495468    *           PAGES_MADE_YOUNG_RATE: 0
       PAGES_MADE_NOT_YOUNG_RATE: 0                    *       PAGES_MADE_NOT_YOUNG_RATE: 0
               NUMBER_PAGES_READ: 39009                *               NUMBER_PAGES_READ: 38960
            NUMBER_PAGES_CREATED: 0                    *            NUMBER_PAGES_CREATED: 0
            NUMBER_PAGES_WRITTEN: 1                    *            NUMBER_PAGES_WRITTEN: 1
                 PAGES_READ_RATE: 1249.8306506241734   *                 PAGES_READ_RATE: 0
               PAGES_CREATE_RATE: 0                    *               PAGES_CREATE_RATE: 0
              PAGES_WRITTEN_RATE: 0.032257023966968806 *              PAGES_WRITTEN_RATE: 0
                NUMBER_PAGES_GET: 6731790              *                NUMBER_PAGES_GET: 6731253
                        HIT_RATE: 995                  *                        HIT_RATE: 0
    YOUNG_MAKE_PER_THOUSAND_GETS: 4                    *    YOUNG_MAKE_PER_THOUSAND_GETS: 0
NOT_YOUNG_MAKE_PER_THOUSAND_GETS: 0                    *NOT_YOUNG_MAKE_PER_THOUSAND_GETS: 0
         NUMBER_PAGES_READ_AHEAD: 38459                *         NUMBER_PAGES_READ_AHEAD: 38457
       NUMBER_READ_AHEAD_EVICTED: 0                    *       NUMBER_READ_AHEAD_EVICTED: 0
                 READ_AHEAD_RATE: 1240.5728847456533   *                 READ_AHEAD_RATE: 0
         READ_AHEAD_EVICTED_RATE: 0                    *         READ_AHEAD_EVICTED_RATE: 0
                    LRU_IO_TOTAL: 531                  *                    LRU_IO_TOTAL: 431
                  LRU_IO_CURRENT: 0                    *                  LRU_IO_CURRENT: 0
                UNCOMPRESS_TOTAL: 0                    *                UNCOMPRESS_TOTAL: 0
              UNCOMPRESS_CURRENT: 0                                   UNCOMPRESS_CURRENT: 0

As promised, here are the results from joining the tables in the other direction

mysql> select straight_join sum(lo_extendedprice*lo_discount) as revenue
from dim_date join lineorder on lo_orderdatekey = d_datekey
where d_year = 1993 and lo_discount between 1 and 3
and lo_quantity < 25\G
*************************** 1. row ***************************
revenue: 446268068091
1 row in set (22.54 sec)
mysql> explain select straight_join sum(lo_extendedprice*lo_discount) as revenue
    -> from dim_date join lineorder on lo_orderdatekey = d_datekey
    -> where d_year = 1993 and lo_discount between 1 and 3
    -> and lo_quantity < 25\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: dim_date
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2704
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: lineorder
         type: ref
possible_keys: LO_OrderDateKey
          key: LO_OrderDateKey
      key_len: 4
          ref: ssb.dim_date.D_DateKey
         rows: 2837
        Extra: Using where
2 rows in set (0.00 sec)
mysql> select * from information_schema.global_status where variable_name like '%innodb%read%';
+---------------------------------------+----------------+
| VARIABLE_NAME                         | VARIABLE_VALUE |
+---------------------------------------+----------------+
| INNODB_BUFFER_POOL_READ_AHEAD_RND     | 0              |
| INNODB_BUFFER_POOL_READ_AHEAD         | 0              |
| INNODB_BUFFER_POOL_READ_AHEAD_EVICTED | 0              |
| INNODB_BUFFER_POOL_READ_REQUESTS      | 3776369        |
| INNODB_BUFFER_POOL_READS              | 191571         |
| INNODB_DATA_PENDING_READS             | 0              |
| INNODB_DATA_READ                      | 3140882432     |
| INNODB_DATA_READS                     | 191581         |
| INNODB_PAGES_READ                     | 191570         |
| INNODB_ROWS_READ                      | 910844         |
+---------------------------------------+----------------+
10 rows in set (0.01 sec)
mysql> select sq.*, pages / ((@@innodb_buffer_pool_size / 16384)) * 100 pct_buffer_pool from (select table_name, index_name, count(*) pages, sum(is_old='YES') old, count(*) - sum(is_old='YES') hot, sum(number_records) records from information_schema.innodb_buffer_page_lru where  table_name like '%ssb%' group by 1,2) sq order by pct_buffer_pool desc;
+-------------------+------------------+-------+------+------+---------+-----------------+
| table_name        | index_name       | pages | old  | hot  | records | pct_buffer_pool |
+-------------------+------------------+-------+------+------+---------+-----------------+
| `ssb`.`lineorder` | GEN_CLUST_INDEX  |  6001 | 2095 | 3906 |  964974 |         73.2544 |
| `ssb`.`lineorder` | LO_OrderDateKey  |    31 |   28 |    3 |   18223 |          0.3784 |
| `ssb`.`dim_date`  | PRIMARY          |    17 |   11 |    6 |    2414 |          0.2075 |
| `ssb`.`lineorder` | LO_OrderKey      |    17 |   17 |    0 |   11320 |          0.2075 |
| `ssb`.`lineorder` | LO_PartKey       |    17 |   17 |    0 |   10095 |          0.2075 |
| `ssb`.`lineorder` | LO_CustKey       |    17 |   17 |    0 |    9874 |          0.2075 |
| `ssb`.`lineorder` | LO_CommitDateKey |    16 |   16 |    0 |   10775 |          0.1953 |
| `ssb`.`lineorder` | LO_SuppKey       |    16 |   16 |    0 |   11879 |          0.1953 |
+-------------------+------------------+-------+------+------+---------+-----------------+
8 rows in set (0.11 sec)
mysql> select * from information_schema.innodb_buffer_pool_stats\G
*************************** 1. row ***************************
                         POOL_ID: 0
                       POOL_SIZE: 8192
                    FREE_BUFFERS: 1024
                  DATABASE_PAGES: 6175
              OLD_DATABASE_PAGES: 2259
         MODIFIED_DATABASE_PAGES: 0
              PENDING_DECOMPRESS: 0
                   PENDING_READS: 0
               PENDING_FLUSH_LRU: 0
              PENDING_FLUSH_LIST: 0
                PAGES_MADE_YOUNG: 62
            PAGES_NOT_MADE_YOUNG: 2054952
           PAGES_MADE_YOUNG_RATE: 1.0508296469551364
       PAGES_MADE_NOT_YOUNG_RATE: 34829.104591447605
               NUMBER_PAGES_READ: 191834
            NUMBER_PAGES_CREATED: 0
            NUMBER_PAGES_WRITTEN: 1
                 PAGES_READ_RATE: 3246.91106930391
               PAGES_CREATE_RATE: 0
              PAGES_WRITTEN_RATE: 0.01694886527346994
                NUMBER_PAGES_GET: 3777151
                        HIT_RATE: 950
    YOUNG_MAKE_PER_THOUSAND_GETS: 0
NOT_YOUNG_MAKE_PER_THOUSAND_GETS: 544
         NUMBER_PAGES_READ_AHEAD: 0
       NUMBER_READ_AHEAD_EVICTED: 0
                 READ_AHEAD_RATE: 0
         READ_AHEAD_EVICTED_RATE: 0
                    LRU_IO_TOTAL: 186940
                  LRU_IO_CURRENT: 0
                UNCOMPRESS_TOTAL: 0
              UNCOMPRESS_CURRENT: 0
1 row in set (0.00 sec)

And with innodb_old_blocks_time=0:

mysql> select straight_join sum(lo_extendedprice*lo_discount) as revenue
from dim_date join lineorder on lo_orderdatekey = d_datekey
where d_year = 1993 and lo_discount between 1 and 3 and lo_quantity < 25\G
*************************** 1. row ***************************
revenue: 446268068091
1 row in set (12.36 sec)
mysql> select * from information_schema.global_status where variable_name like '%innodb%read%';
+---------------------------------------+----------------+
| VARIABLE_NAME                         | VARIABLE_VALUE |
+---------------------------------------+----------------+
| INNODB_BUFFER_POOL_READ_AHEAD_RND     | 0              |
| INNODB_BUFFER_POOL_READ_AHEAD         | 0              |
| INNODB_BUFFER_POOL_READ_AHEAD_EVICTED | 0              |
| INNODB_BUFFER_POOL_READ_REQUESTS      | 3811806        |
| INNODB_BUFFER_POOL_READS              | 186407         |
| INNODB_DATA_PENDING_READS             | 0              |
| INNODB_DATA_READ                      | 3056275456     |
| INNODB_DATA_READS                     | 186417         |
| INNODB_PAGES_READ                     | 186406         |
| INNODB_ROWS_READ                      | 910844         |
+---------------------------------------+----------------+
10 rows in set (0.00 sec)
mysql> select sq.*, pages / ((@@innodb_buffer_pool_size / 16384)) * 100 pct_buffer_pool
from (
select table_name, index_name, count(*) pages,
sum(is_old='YES') old, count(*) - sum(is_old='YES') hot,
sum(number_records) records
from information_schema.innodb_buffer_page_lru
where  table_name like '%ssb%'
group by 1,2
) sq
order by pct_buffer_pool desc;
+-------------------+-----------------+-------+------+------+---------+-----------------+
| table_name        | index_name      | pages | old  | hot  | records | pct_buffer_pool |
+-------------------+-----------------+-------+------+------+---------+-----------------+
| `ssb`.`lineorder` | GEN_CLUST_INDEX |  6980 | 2563 | 4417 | 1119893 |         85.2051 |
| `ssb`.`lineorder` | LO_OrderDateKey |    47 |   17 |   30 |   30637 |          0.5737 |
| `ssb`.`dim_date`  | PRIMARY         |    12 |    0 |   12 |    1841 |          0.1465 |
+-------------------+-----------------+-------+------+------+---------+-----------------+
3 rows in set (0.12 sec)
mysql> select * from information_schema.innodb_buffer_pool_stats\G
*************************** 1. row ***************************
                         POOL_ID: 0
                       POOL_SIZE: 8192
                    FREE_BUFFERS: 1024
                  DATABASE_PAGES: 7047
              OLD_DATABASE_PAGES: 2581
         MODIFIED_DATABASE_PAGES: 0
              PENDING_DECOMPRESS: 0
                   PENDING_READS: 0
               PENDING_FLUSH_LRU: 0
              PENDING_FLUSH_LIST: 0
                PAGES_MADE_YOUNG: 194023
            PAGES_NOT_MADE_YOUNG: 0
           PAGES_MADE_YOUNG_RATE: 4850.4537386565335
       PAGES_MADE_NOT_YOUNG_RATE: 0
               NUMBER_PAGES_READ: 186422
            NUMBER_PAGES_CREATED: 0
            NUMBER_PAGES_WRITTEN: 1
                 PAGES_READ_RATE: 4653.858653533662
               PAGES_CREATE_RATE: 0
              PAGES_WRITTEN_RATE: 0.02499937501562461
                NUMBER_PAGES_GET: 3811961
                        HIT_RATE: 952
    YOUNG_MAKE_PER_THOUSAND_GETS: 50
NOT_YOUNG_MAKE_PER_THOUSAND_GETS: 0
         NUMBER_PAGES_READ_AHEAD: 0
       NUMBER_READ_AHEAD_EVICTED: 0
                 READ_AHEAD_RATE: 0
         READ_AHEAD_EVICTED_RATE: 0
                    LRU_IO_TOTAL: 186024
                  LRU_IO_CURRENT: 0
                UNCOMPRESS_TOTAL: 0
              UNCOMPRESS_CURRENT: 0
1 row in set (0.00 sec)

Finally, I collected SHOW PROFILES information for the faster join direction (fact -> dimension)

select p1.seq, p1.state state, p1.duration, p2.duration, p1.cpu_user + p1.cpu_system p1_cpu, p2.cpu_user + p2.cpu_system p2_cpu,
p1.context_voluntary + p1.context_involuntary p1_cs, p2.context_voluntary + p2.context_involuntary p2_cs,
p1.block_ops_in + p1.block_ops_out p1_block_ops, p2.block_ops_in + p2.block_ops_out p2_block_ops,
p1.page_faults_major + p1.page_faults_minor p1_pf, p2.page_faults_major + p2.page_faults_minor p2_pf
from p1 join p2 using(seq)
where p1.state = p2.state
order by p1.duration desc;
+-----+----------------------+-----------+----------+-----------+-----------+-------+-------+--------------+--------------+-------+-------+
| seq | state                | duration  | duration | p1_cpu    | p2_cpu    | p1_cs | p2_cs | p1_block_ops | p2_block_ops | p1_pf | p2_pf |
+-----+----------------------+-----------+----------+-----------+-----------+-------+-------+--------------+--------------+-------+-------+
|  12 | Sending data         | 33.764396 | 7.523023 | 40.173893 | 13.027019 |  4979 | 21399 |            0 |            0 |    90 |    90 |
|   5 | Opening tables       |  0.270664 | 0.295955 |  0.025996 |  0.024996 |    34 |    35 |         2056 |         1488 |    48 |    48 |
|   2 | starting             |  0.000230 | 0.000192 |  0.000000 |  0.000000 |     0 |     0 |            0 |            0 |    29 |    29 |
|   9 | statistics           |  0.000130 | 0.000097 |  0.000000 |  0.000000 |     0 |     0 |            0 |            0 |    26 |    26 |
|   6 | init                 |  0.000105 | 0.000138 |  0.000000 |  0.000000 |     0 |     0 |            0 |            0 |    26 |    26 |
|  10 | preparing            |  0.000068 | 0.000064 |  0.000000 |  0.000000 |     0 |     0 |            0 |            0 |    14 |    14 |
|  16 | freeing items        |  0.000049 | 0.000117 |  0.000000 |  0.001000 |     0 |     0 |            0 |            0 |     3 |     3 |
|   8 | optimizing           |  0.000048 | 0.000048 |  0.000000 |  0.000000 |     0 |     0 |            0 |            0 |     8 |     8 |
|   7 | System lock          |  0.000031 | 0.000031 |  0.000000 |  0.000000 |     0 |     0 |            0 |            0 |     1 |     1 |
|  13 | end                  |  0.000027 | 0.000026 |  0.000000 |  0.000000 |     0 |     0 |            0 |            0 |     1 |     1 |
|   4 | checking permissions |  0.000015 | 0.000014 |  0.000000 |  0.000000 |     0 |     0 |            0 |            0 |     0 |     0 |
|  15 | closing tables       |  0.000015 | 0.000016 |  0.000000 |  0.000000 |     0 |     0 |            0 |            0 |     0 |     0 |
|   3 | checking permissions |  0.000014 | 0.000014 |  0.000000 |  0.000000 |     0 |     0 |            0 |            0 |     0 |     0 |
|  11 | executing            |  0.000013 | 0.000013 |  0.000000 |  0.000000 |     0 |     0 |            0 |            0 |     0 |     0 |
|  14 | query end            |  0.000011 | 0.000012 |  0.000000 |  0.000000 |     0 |     0 |            0 |            0 |     0 |     0 |
+-----+----------------------+-----------+----------+-----------+-----------+-------+-------+--------------+--------------+-------+-------+
15 rows in set (0.00 sec)

 Here are my modified versions of the queries (just to use ANSI JOIN syntax):

-- Q1.1
select sum(lo_extendedprice*lo_discount) as
revenue
from lineorder join  dim_date on lo_orderdatekey = d_datekey
where
d_year = 1993
and lo_discount between 1 and 3
and lo_quantity < 25;
-- Q1.2
select sum(lo_extendedprice*lo_discount) as revenue
from lineorder
join  dim_date on lo_orderdatekey = d_datekey
where  d_yearmonth = 199401 and lo_discount
between 4 and 6 and lo_quantity between 26 and 35;
-- Q1.3
select sum(lo_extendedprice*lo_discount) as revenue
from lineorder
join  dim_date on lo_orderdatekey = d_datekey
where  d_weeknuminyear = 6
and d_year = 1994
and lo_discount between 5 and 7 and lo_quantity between 26 and 35;
-- Q2.1
select sum(lo_revenue), d_year, p_brand
from lineorder
join dim_date
 on lo_orderdatekey = d_datekey
join part
on lo_partkey = p_partkey join supplier
on lo_suppkey = s_suppkey
where  p_category = 'MFGR#12'
and s_region = 'AMERICA'
group by d_year, p_brand
order by d_year, p_brand;
-- Q2.2
select sum(lo_revenue), d_year, p_brand
from lineorder
join dim_date
on lo_orderdatekey = d_datekey
join part
on lo_partkey = p_partkey
join supplier
on lo_suppkey = s_suppkey
where  p_brand between 'MFGR#2221' and 'MFGR#2228'
and s_region = 'ASIA'
group by d_year, p_brand
order by d_year, p_brand;
-- Q2.3
select sum(lo_revenue), d_year, p_brand
from lineorder
join dim_date
on lo_orderdatekey = d_datekey
join part
on lo_partkey = p_partkey
join supplier
on lo_suppkey = s_suppkey
where  p_brand= 'MFGR#2239'
and s_region = 'EUROPE'
group by d_year, p_brand
order by d_year, p_brand;
-- Q3.1
select c_nation, s_nation, d_year, sum(lo_revenue) as revenue
from customer
join lineorder
on lo_custkey = c_customerkey
join supplier
on lo_suppkey = s_suppkey
join dim_date   on lo_orderdatekey = d_datekey
where c_region = 'ASIA'
and s_region = 'ASIA'
and d_year <= 1992 and d_year >= 1997
group by c_nation, s_nation, d_year
order by d_year asc, revenue desc;
-- Q3.2
select c_city, s_city, d_year, sum(lo_revenue) as revenue
from customer
join lineorder
on lo_custkey = c_customerkey
join supplier
on lo_suppkey = s_suppkey
join dim_date
on lo_orderdatekey = d_datekey
where c_nation = 'UNITED STATES'
and s_nation = 'UNITED STATES'
and d_year >= 1992
and d_year <= 1997
group by c_city, s_city, d_year
order by d_year asc, revenue desc;
-- Q3.3
select c_city, s_city, d_year, sum(lo_revenue) as revenue
from customer
join lineorder
on lo_custkey = c_customerkey
join supplier   on lo_suppkey = s_suppkey
join dim_date   on lo_orderdatekey = d_datekey
where (c_city='UNITED KI1' or c_city='UNITED KI5')
and (s_city='UNITED KI1' or s_city='UNITED KI5')
and d_year >= 1992
and d_year <= 1997
group by c_city, s_city, d_year
order by d_year asc, revenue desc;
-- Q3.4
select c_city, s_city, d_year, sum(lo_revenue)
as revenue
from customer
join lineorder
  on lo_custkey = c_customerkey
join supplier
  on lo_suppkey = s_suppkey
join dim_date
  on lo_orderdatekey = d_datekey
where
(c_city='UNITED KI1' or c_city='UNITED KI5')
and (s_city='UNITED KI1' or s_city='UNITED KI5')
and d_yearmonth = 'Dec1997'
group by c_city, s_city, d_year
order by d_year asc, revenue desc;
-- Q4.1
select d_year, c_nation,
sum(lo_revenue - lo_supplycost) as profit
from lineorder
join dim_date
  on lo_orderdatekey = d_datekey
join customer
  on lo_custkey = c_customerkey
join supplier
  on lo_suppkey = s_suppkey
join part
  on lo_partkey = p_partkey
where
c_region = 'AMERICA'
and s_region = 'AMERICA'
and (p_mfgr = 'MFGR#1'
or p_mfgr = 'MFGR#2')
group by d_year, c_nation
order by d_year, c_nation;
-- Q4.2
select d_year, s_nation, p_category,
sum(lo_revenue - lo_supplycost) as profit
from lineorder
join dim_date
  on lo_orderdatekey = d_datekey
join customer
  on lo_custkey = c_customerkey
join supplier
  on lo_suppkey = s_suppkey
join part
  on lo_partkey = p_partkey
where
c_region = 'AMERICA'
and s_region = 'AMERICA'
and (d_year = 1997 or d_year = 1998)
and (p_mfgr = 'MFGR#1'
or p_mfgr = 'MFGR#2')
group by d_year, s_nation, p_category
order by d_year, s_nation, p_category;
-- Q4.3
select d_year, s_city, p_brand,
sum(lo_revenue - lo_supplycost) as profit
from lineorder
join dim_date
  on lo_orderdatekey = d_datekey
join customer
  on lo_custkey = c_customerkey
join supplier
  on lo_suppkey = s_suppkey
join part
  on lo_partkey = p_partkey
where
s_nation = 'UNITED STATES'
and (d_year = 1997 or d_year = 1998)
and p_category = 'MFGR#14'
group by d_year, s_city, p_brand
order by d_year, s_city, p_brand;

And the schema:

DROP TABLE IF EXISTS customer;
CREATE TABLE IF NOT EXISTS customer
(
    C_CustomerKey int primary key,
    C_Name varchar(25),
    C_Address varchar(25),
    C_City varchar(10),
    C_Nation varchar(15),
    C_Region varchar(12),
    C_Phone varchar(15),
    C_MktSegment varchar(10),
    KEY(C_Name),
    KEY(C_City),
    KEY(C_Region),
    KEY(C_Phone),
    KEY(C_MktSegment)
);
DROP TABLE IF EXISTS part;
CREATE TABLE IF NOT EXISTS part
(
    P_PartKey int primary key,
    P_Name varchar(25),
    P_MFGR varchar(10),
    P_Category varchar(10),
    P_Brand varchar(15),
    P_Colour varchar(15),
    P_Type varchar(25),
    P_Size tinyint,
    P_Container char(10),
    key(P_Name),
    key(P_MFGR),
    key(P_Category),
    key(P_Brand)
);
DROP TABLE IF EXISTS supplier;
CREATE TABLE supplier
(
    S_SuppKey int primary key,
    S_Name char(25),
    S_Address varchar(25),
    S_City char(10),
    S_Nation char(15),
    S_Region char(12),
    S_Phone char(15),
    key(S_City),
    key(S_Name),
    key(S_Phone)
);
DROP TABLE IF EXISTS dim_date;
CREATE TABLE IF NOT EXISTS dim_date
(
    D_DateKey int primary key,
    D_Date char(18),
    D_DayOfWeek char(9),
    D_Month char(9),
    D_Year smallint,
    D_YearMonthNum int,
    D_YearMonth char(7),
    D_DayNumInWeek tinyint,
    D_DayNumInMonth tinyint,
    D_DayNumInYear smallint,
    D_MonthNumInYear tinyint,
    D_WeekNumInYear tinyint,
    D_SellingSeason char(12),
    D_LastDayInWeekFl tinyint,
    D_LastDayInMonthFl tinyint,
    D_HolidayFl tinyint,
    D_WeekDayFl tinyint
);
DROP TABLE IF EXISTS lineorder;
CREATE TABLE IF NOT EXISTS lineorder
(
    LO_OrderKey bigint not null,
    LO_LineNumber tinyint not null,
    LO_CustKey int not null,
    LO_PartKey int not null,
    LO_SuppKey int not null,
    LO_OrderDateKey int not null,
    LO_OrderPriority varchar(15),
    LO_ShipPriority char(1),
    LO_Quantity tinyint,
    LO_ExtendedPrice decimal,
    LO_OrdTotalPrice decimal,
    LO_Discount decimal,
    LO_Revenue decimal,
    LO_SupplyCost decimal,
    LO_Tax tinyint,
    LO_CommitDateKey int not null,
    LO_ShipMode varchar(10),
    KEY(LO_OrderKey, LO_LineNumber),
    KEY(LO_CustKey),
    KEY(LO_SuppKey),
    KEY(LO_PartKey),
    KEY(LO_OrderDateKey),
    KEY(LO_CommitDateKey)
);

The post MySQL 5.6 vs MySQL 5.5 and the Star Schema Benchmark appeared first on MySQL Performance Blog.