Cost-based choice between subquery materialization and EXISTS

In a previous post, I had demonstrated how subquery materialization, introduced in MySQL 5.6.5, improves the performance of certain queries, like query Q16 of DBT3. Such improvement was easily explained:

  • Subquery materialization has a high start up cost (it needs to create and fill the temporary table).
  • But afterwards it has fast lookups (temporary table has a hash index, no duplicates, and is in memory).
  • In other words, compared to EXISTS, the first evaluation of the IN predicate is slow (high start up cost) and all following evaluations are fast (just a hash lookup).
  • In the DBT 3 setup, one outer table (named "part") has 200,000 rows, so there are 200,000 evaluations of IN, thus subquery materialization wins over EXISTS because the time it loses in the first evaluation is more than compensated by the many faster following evaluations.

However, if there were only few outer rows, then subquery materialization should logically be slower than EXISTS (the compensation would not happen anymore)... MySQL 5.6.5, by blindly always choosing subquery materialization, takes the risk of making certain queries slower. There needs to be a cost-based choice between the two strategies, to pick the best, depending on the situation! That is what I have implemented in MySQL 5.6.7.

To show it in action, I will use query Q16 again. First I will run it with the normal "part" table which has 200,000 rows. Then I will reduce this table to only 200 rows, and run the query again. Each time, I will run EXPLAIN to see what subquery strategy is chosen by the optimizer. I will also, by tweaking the optimizer_switch variable, force the optimizer to use the other strategy which it didn't like, in order to verify that it is indeed worse.

For brevity, let me jump directly to the results, obtained with a release build of MySQL 5.6.7 on my machine:

Rows in part Optimizer chooses Execution time If I force alternative
200,000 Materialization 550 ms 830 ms
200 EXISTS 1 ms 10 ms


We can see that in both cases the optimizer has made the right choice!