Extended keys: First in MariaDB 5.5, now in mysql-trunk, too

One of the optimizations we have introduced in MariaDB 5.5 is Extended keys. The idea behind it is rather simple. Inside InnoDB, every secondary index has an invisible suffix of primary key columns. That is, when you create an index:

ALTER TABLE innodb_tbl ADD INDEX (column1, column2);

you’re actually creating this

ALTER TABLE innodb_tbl ADD INDEX (column1, column2, primary_key_column1, …, primary_key_columnN);

The index is extended with primary key columns. SHOW KEYS does not show these extra key parts, but they are there.

Traditionally, MySQL optimizer was half-aware of these extra columns. It knew that doing an index-only scan on InnoDB’s secondary key would read the primary key columns also, and used this property. On the other hand, the optimizer was not able to use the extra columns to do a ref or range access, or to resolve an ORDER BY clause (correction: the optimizer did take extra columns into account when resolving ORDER BY clauses). If you had a statement like:

SELECT … FROM innodb_tbl WHERE column1=’foo’ AND ‘column2=’bar’ AND primary_key_column1=’baz’ 

the optimizer was only able to use two key parts for ref access. Extended keys optimization in MariaDB 5.5 removed this kind of limitations. MariaDB 5.5 is able to use the extra index columns for any purpose MySQL/MariaDB can use an index for.

So, what’s the news? The news is this commit. It seems, Oracle has decided to follow and also support extended keys. The email subject is “bzr push into mysql-trunk“, I suppose it means that the feature has been pushed into whatever will be the next version after MySQL 5.6 (mysql-trunk tree is not publicly available, so there’s a lot of guessing).

Looking at their patch, I see two differences from MariaDB’s version:

  1. optimizer_switch flag is named extended_keys in MariaDB and extended_secondary_keys in MySQL-trunk
  2. they inform the optimizer that the extended key is a UNIQUE key

The first one is trivial. The second one is a bit puzzling. It is true that primary key columns are unique, hence adding them to a secondary index makes the extended secondary index unique, too. But what is the benefit of knowing this? Index Unique-ness can only be used when you’ve got values for each of the columns. But if you’ve got values for each of extended key columns (column1, column2, primary_key_column1, …, primary_key_columnN), you’ve also got value for each of primary key columns (primary_key_column1, …, primary_key_columnN). Why would you want to use the secondary index then? The primary key is also usable, and it is shorter.

It would be nice to know what Sergey Glukhov (author of Oracle’s implementation) had in mind when he was coding this. Or, we’ll have to figure on our own, and add the extended-key-is-unique feature to MariaDB’s implementation.