Home |  MySQL Buzz |  FAQ |  Feeds |  Submit your blog feed |  Feedback |  Archive |  Aggregate feed RSS 2.0 English Deutsch Español Français Italiano 日本語 Русский
InnoDB: look after fragmentation
+5 Vote Up -0 Vote Down

One problem made me puzzled for couple hours, but it was really interesting to figure out what's going on.

So let me introduce problem at first. The table is

PLAIN TEXT
CODE:
  1. CREATE TABLE `c` (
  2.   `tracker_id` int(10) unsigned NOT NULL,
  3.   `username` char(20) character set latin1 collate latin1_bin NOT NULL,
  4.   `time_id` date NOT NULL,
  5.   `block_id` int(10) unsigned default NULL,
  6.   PRIMARY KEY  (`tracker_id`,`username`,`time_id`),
  7.   KEY `block_id` (`block_id`)
  8. ) ENGINE=InnoDB

Table has 11864696 rows and takes Data_length: 698,351,616 bytes on disk

The problem is that after restoring table from mysqldump, the query that scans data by primary key was slow. How slow ? Let me show.

The query in question is (Q1):

SELECT count(distinct username) FROM tracker where TIME_ID >= '2009-07-20 00:00:00' AND TIME_ID <= '2009-10-21 00:00:00' AND (tracker_id=437)

On cold buffer_pool, it took:

PLAIN TEXT
CODE:
  1. +---------------------------+
  2. | count(distinct username) |
  3. +---------------------------+
  4. |                   5856156 |
  5. +---------------------------+
  6. 1 row in set (4 min 13.61 sec)

However the query (again on cold buffer_pool) (Q2)

SELECT count(distinct username) FROM tracker where TIME_ID >= '2009-07-20 00:00:00' AND TIME_ID <= '2009-10-21 00:00:00'

PLAIN TEXT
CODE:
  1. +---------------------------+
  2. | count(distinct username) |
  3. +---------------------------+
  4. |                   5903053 |
  5. +---------------------------+
  6. 1 row in set (18.81 sec)

Difference is impressive. 4 min 13.61 sec vs 18.81 sec

If you want EXPLAIN plain, here it is:

For Q1:

PLAIN TEXT
CODE:
  1. +----+-------------+-------------------------+------+---------------+---------+---------+-------+---------+--------------------------+
  2. | id | select_type | table                   | type | possible_keys | key     | key_len | ref   | rows    | Extra                    |
  3. +----+-------------+-------------------------+------+---------------+---------+---------+-------+---------+--------------------------+
  4. 1 | SIMPLE      | tracker  | ref  | PRIMARY       | PRIMARY | 4       | const | 6880241 | Using where; Using index |
  5. +----+-------------+-------------------------+------+---------------+---------+---------+-------+---------+--------------------------+
  6. 1 row in set (0.02 sec)

For Q2:

PLAIN TEXT
CODE:
  1. +----+-------------+-------------------------+-------+---------------+-------------------------------------+---------+------+----------+--------------------------+
  2. | id | select_type | table                   | type  | possible_keys | key                                 | key_len | ref  | rows     | Extra                    |
  3. +----+-------------+-------------------------+-------+---------------+-------------------------------------+---------+------+----------+--------------------------+
  4. 1 | SIMPLE      | tracker | index | NULL          | block_id | 5       | NULL | 13760483 | Using where; Using index |
  5. +----+-------------+-------------------------+-------+---------------+-------------------------------------+---------+------+----------+--------------------------+

Query Q1 is executed using Primary Key, and Query Q2 is using block_id key.

To get more details I ran both queries with our extended stats in slow.log (available in 5.0-percona releases)

So for query Q1:

PLAIN TEXT
CODE:
  1. # Query_time: 253.643162  Lock_time: 0.000137  Rows_sent: 1  Rows_examined: 11569733  Rows_affected: 0  Rows_read: 11569733
  2. #   InnoDB_IO_r_ops: 73916  InnoDB_IO_r_bytes: 1211039744  InnoDB_IO_r_wait: 236.149003
  3. #   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
  4. #   InnoDB_pages_distinct: 54838

And for query Q2:

PLAIN TEXT
CODE:
  1. # Query_time: 18.846855  Lock_time: 0.000123  Rows_sent: 1  Rows_examined: 11864696  Rows_affected: 0  Rows_read: 11864696
  2. #   InnoDB_IO_r_ops: 27510  InnoDB_IO_r_bytes: 450723840  InnoDB_IO_r_wait: 0.165124
  3. #   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
  4. #   InnoDB_pages_distinct: 24687

As you see for Q1 IO read took 236.149003 sec vs 0.165124 for Q2. But Q1 is scan by primary key, which supposed to be
sequential!

Let's see on another statistic, which available in innodb_check_fragmentation patch:

for Q1:

PLAIN TEXT
CODE:
  1. SHOW STATUS LIKE 'Innodb_scan_pages%';
  2. +------------------------------+-------+
  3. | Variable_name                | Value |
  4. +------------------------------+-------+
  5. | Innodb_scan_pages_contiguous | 88    |
  6. | Innodb_scan_pages_jumpy      | 73789 |
  7. +------------------------------+-------+
  8. 2 rows in set (0.00 sec)

and for Q2:

PLAIN TEXT
CODE:
  1. mysql> SHOW STATUS LIKE 'Innodb_scan_pages%';       
  2. +------------------------------+-------+
  3. | Variable_name                | Value |
  4. +------------------------------+-------+
  5. | Innodb_scan_pages_contiguous | 26959 |
  6. | Innodb_scan_pages_jumpy      | 442   |
  7. +------------------------------+-------+
  8. 2 rows in set (0.00 sec)

So you see for Q1 it was not sequential scan, even it is primary key, but it is sequential for Q2.

So what's the answer ? It's fragmentation of primary key (and whole data table, as InnoDB data == primary key). But how it could happen with
primary key after mysqldump ? The answer here if we look on

EXPLAIN SELECT * FROM tracker;

PLAIN TEXT
CODE:
  1. +----+-------------+-------------------------+-------+---------------+-------------------------------------+---------+------+----------+-------------+
  2. | id | select_type | table                   | type  | possible_keys | key                                 | key_len | ref  | rows     | Extra       |
  3. +----+-------------+-------------------------+-------+---------------+-------------------------------------+---------+------+----------+-------------+
  4. 1 | SIMPLE      | tracker | index | NULL          | block_id | 5       | NULL | 13760483 | Using index |
  5. +----+-------------+-------------------------+-------+---------------+-------------------------------------+---------+------+----------+-------------+
  6. 1 row in set (0.00 sec)

We see that dump is taken in key "block_id" order, not in primary key order. And later when we load this table, INSERTS into primary key happens in random order, and that gives us the fragmentation we see here.

How to fix it in our case. It's easy: ALTER TABLE tracker ENGINE=InnoDB, it will force InnoDB to rebuild table in primary key order.

After that Q1:

PLAIN TEXT
CODE:
  1. +---------------------------+
  2. | count(distinct username) |
  3. +---------------------------+
  4. |                   5856156 |
  5. +---------------------------+
  6. 1 row in set (17.72 sec)
  7.  
  8. mysql> SHOW STATUS LIKE 'Innodb_scan_pages%';
  9. +------------------------------+-------+
  10. | Variable_name                | Value |
  11. +------------------------------+-------+
  12. | Innodb_scan_pages_contiguous | 37864 |
  13. | Innodb_scan_pages_jumpy      | 574   |
  14. +------------------------------+-------+
  15. 2 rows in set (0.00 sec)
  16.  
  17. and extended stats:
  18. # Query_time: 17.765369  Lock_time: 0.000137  Rows_sent: 1  Rows_examined: 11569733  Rows_affected: 0  Rows_read: 11569733
  19. #   InnoDB_IO_r_ops: 38530  InnoDB_IO_r_bytes: 631275520  InnoDB_IO_r_wait: 0.204893
  20. #   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
  21. #   InnoDB_pages_distinct: 35584

You see that time returned to appropriate 17.72 sec.

You may ask what happens now with Q2 ? yes, it's getting slow now, as we made key "block_id" inserted not in order.

PLAIN TEXT
CODE:
  1. +---------------------------+
  2. | count(distinct username) |
  3. +---------------------------+
  4. |                   5903053 |
  5. +---------------------------+
  6. 1 row in set (2 min 8.92 sec)
  7. mysql> SHOW STATUS LIKE 'Innodb_scan_pages%';
  8. +------------------------------+-------+
  9. | Variable_name                | Value |
  10. +------------------------------+-------+
  11. | Innodb_scan_pages_contiguous | 45    |
  12. | Innodb_scan_pages_jumpy      | 35904 |
  13. +------------------------------+-------+
  14. 2 rows in set (0.00 sec)

As for mysqldump you may use --order-by-primary options to force dump in primary key order.

So notes to highlight:

  • InnoDB fragmentation may hurt your query significantly, especially when data is not in buffer_pool and execution goes to read from disk
  • Fragmentation by secondary key is much more likely than by primary key, and you cannot really control it (tough it is possible in XtraDB / InnoDB-plugin with FAST INDEX creation) so be careful with queries scan many records by secondary key
  • To check if you query affected by fragmentation you can use Innodb_scan_pages_contiguous ; Innodb_scan_pages_jumpy statistics in 5.0-percona releases (coming to 5.1-XtraDB soon)

Entry posted by Vadim | 7 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Votes:

You must be logged in with a MySQL.com account to vote on Planet MySQL entries. More information on PlanetMySQL voting.

Planet MySQL © 1995-2008 MySQL AB, 2008-2009 Sun Microsystems, Inc.
Content reproduced on this site is the property of the respective copyright holders.
It is not reviewed in advance by Sun Microsystems, Inc. and does not
necessarily represent the opinion of Sun Microsystem, Inc. or any other party.