A few customers with rather extreme needs have contacted us about
a performance issue with the range optimizer. Our solution to the
problem is to introduce a new variable in MySQL 5.6,
eq_range_index_dive_limit, which can be used to control whether
or not the range optimizer will a) do index dives, or b) use
index statistics when estimating the number of rows in the ranges
of the query. The former method gives a far more accurate
estimate while the latter costs a lot less to compute.
This is what the help text has to tell about the variable:
The optimizer will use existing index statistics instead of doing
index dives for equality ranges if the number of equality ranges
for the index is larger than or equal to [the value of variable].
If set to 0, index dives are always used."Equality range" means
predicates using operators IN() or =, and it's important to
notice that the number of such ranges is counted on a per index …
Over the last few weeks I’ve been looking at several customers’
slow query logs, and I found in many of them an odd type of
query. These are SELECT
statements that contain an
IN
clause that includes dozens, sometimes hundreds
of values. These statements often end in the slow query log. I’m
not sure if these queries are this way by design or if they are
generated by a specific database development tool.
I did some tests in one of my own databases, one with only around 10K rows in its largest table. The database corresponds to the Amarok media player. For example, I queried for songs by B. B. King (spelled “BB King”, “B.B. King”, etc. or with other artists: “B. B. King & Eric Clapton”).
The first query used a JOIN
and an IN
clause with all the …