Filesorts, Secondary Indexes and the Importance of Covering Indexes


Here's a question that was driving me crazy: Why do these two explain plans look different?

explain select a from test order by b;
+----+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | test  | index | NULL          | b    | 5       | NULL | 3263769 | Using index |
+----+-------------+-------+-------+---------------+------+---------+------+---------+-------------+


and this

explain select a,c from test order by b;
+----+-------------+-------+------+---------------+------+---------+------+---------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows    | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+---------+----------------+
|  1 | SIMPLE      | test  | ALL  | NULL          | NULL | NULL    | NULL | 3263769 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+---------+----------------+


The second query does a filesort, but the only change is adding another column to the SELECT clause!

For reference, here is the table structure. As you can see, there's a primary key with auto increment, and a secondary key on b.



CREATE TABLE `test` (
  `a` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `b` int(11) DEFAULT NULL,
  `c` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`a`),
  KEY `b` (`b`)
) ENGINE=InnoDB AUTO_INCREMENT=3278870 DEFAULT CHARSET=latin1


You don't get the same behavior if you're ordering by the primary key value. So it seems there's a big difference when sorting if you use a secondary index.

 The explanation seems to be that (with innodb tables) if you order by a secondary index, and you need to read row data, then mysql has to go back to read the clustered index anyway, so it just ignores the secondary index and does a filesort.

Wow, that makes optimization of sorts much more difficult! What's the solution? I've heard that Multi-Range-Read in MySQL 5.6 will fix this, but I haven't tested that myself yet.

For now, a query like this highlights the importance of covering indexes.



alter table test add index (b,c);

explain select a,c from test order by b;
+----+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | test  | index | NULL          | b_2  | 40      | NULL | 3263685 | Using index |
+----+-------------+-------+-------+---------------+------+---------+------+---------+-------------+