Showing entries 51 to 60 of 110
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: optimizer (reset)
The MySQL Optimizer Cost Model Project

You may not be aware of this but the foundation that the MySQL optimizer builds on when choosing a query plan – the cost model – is for the most part very old. At least in tech terms.

Much of it was written in another millennium, at a time when “Forest Gump” and “Titanic” won Oscars and “Baywatch” was the big thing on TV. Although the revision history doesn’t go that far back, it wouldn’t come as a surprise if it predates that annoying “Macarena” song and even “The Sign” (Ace Of Base) – don’t follow those links unless you’re feeling very brave…

Thankfully, a lot has happened since Ace of Base. Modern CPUs have …

[Read more]
A new dimension to MySQL query optimizations – part 2

This is “A new dimension to MySQL query optimizations – part 2″. If you didn’t read part 1 already I suggest you skim through it before reading on.

To summarize, the problem at hand is this: Given a query with a join between two or more tables, the MySQL optimizer’s mission is to find the best combination of join order and access method so that the response time becomes as low as possible. The optimizer does this by calculating the cost of each combination and then picks the cheapest one.

Consider the following query:

SELECT *
FROM employee JOIN department ON employee.dept_no=department.dept_no
WHERE employee.first_name="John" AND
      employee.hire_date BETWEEN "2012-01-01" AND "2012-06-01"

The optimizer will calculate the cost of the alternative plans as follows:

total cost = …
[Read more]
A new dimension to MySQL query optimizations – part 1

It’s not radical to claim that one of the most important tasks of any DBMS query optimizer is to find the best join order for the tables participating in a query. Is it, e.g., better to read country or city first in

SELECT * 
FROM country JOIN city ON country.id=city.countryid 
WHERE city.population > 1000000 AND 
      country.region="EMEA"

employee or department first in

SELECT *
FROM employee JOIN department ON employee.dept_no=department.dept_no
WHERE employee.first_name="John" AND
      employee.hire_date BETWEEN "2012-01-01" AND "2012-06-01" AND
      department.location="Paris"

If the optimizer gets this wrong, the resulting response time may be disastrous (or hilarious, depending on your sense of humour).

Simply put (and ignoring some edge cases), the MySQL optimizer does the following to find the cheapest combination of access methods and join order for the second query above:

Calculate …

[Read more]
You asked for it: new default for eq_range_index_dive_limit

In MySQL 5.6 we made it possible to estimate the number of fetched rows for queries with many equality predicates (e.g., many values in the IN clause). The new way of estimating fetched rows kicks in if the number of equality predicates exceeds the value of eq_range_index_dive_limit, which has a default of 10.

The good thing about the new way of estimating number of rows is that it is faster than the old way (index dives). On the other hand it is also less accurate.

As it happens, we got push back from the community that the default value should have been higher. Among others we heard this:

“eq_range_index_dive_limit is helpful to reduce index dive costs on …

[Read more]
The range access method and why you should use EXPLAIN JSON

I got an interesting question about EXPLAIN and the range access method recently. The person had a query that could be written either with a BETWEEN predicate or an IN predicate, something similar to this:

mysql> EXPLAIN SELECT * 
-> FROM orders WHERE customer_id BETWEEN 7 AND 10 AND value > 500;
+----+-------------+--------+-------+----------+----------+------+------
| id | select_type | table | type | key | key_len | rows | Extra
+----+-------------+--------+-------+----------+----------+------+------
| 1 | SIMPLE | orders | range | cust_val | 10 | 91 | ...
+----+-------------+--------+-------+----------+----------+------+------

mysql> EXPLAIN SELECT *
-> FROM orders WHERE customer_id IN (7,8,9,10) AND value > 500;
+----+-------------+--------+-------+----------+----------+------+------
| id | select_type | table | type | key | key_len | rows …
[Read more]
Many-table joins in MySQL 5.6

I recently worked on an uncommon slow query: less than 100 rows were read and returned, the whole dataset was fitting in memory but the query took several seconds to run. Long story short: the query was a join involving 21 tables, running on MySQL 5.1. But by default MySQL 5.1 is not good at handling joins with such a large number of tables. The good news is that MySQL 5.6 brings welcome improvements.

Isolating the problem

As always with a slow query, finding the execution plan with EXPLAIN is the 1st step to understand where time is spent. Here the plan was very good with almost all joins using the primary key or a unique key, but perhaps the most interesting part was that EXPLAIN was very slow as well. This indicates that the optimizer takes a lot of time finding the optimal execution plan. To double check, we can run SHOW PROFILE:

mysql> set @@profiling = 1;
mysql> SELECT …
[Read more]
Filesort optimization in 5.7.3: pack values in the sort buffer

In MySQL, filesort is the catch-all algorithm for producing sorted results for ORDER-BY or GROUP-BY queries. MySQL has two algorithms for filesort, both the original and the modified algorithms are described in the user manual.  The most commonly used algorithm is the so called modified algorithm, it is used for all cases except when BLOB and TEXT column are involved.

In 5.7.3, Tor Didriksen in the optimizer team introduced one more optimization that applies to the modified algorithm.  Let us first take a look at how MySQL´s modified filesort algorithm worked up to 5.7.2.

  1. Read the rows that match the WHERE clause.
  2. For each row, record a tuple of values consisting of the sort key value and the additional fields referenced by the query.
  3. When the sort buffer becomes full, sort the tuples …
[Read more]
State of the UNION

Since union queries were introduced in MySQL 4.0.0, they have been processed by executing each part of the union and adding the result to a temporary table. Then a final query block is executed that, if needed, filters out duplicates, do global ordering and limit the number of output rows. The good thing about this method is that it works for all union queries. The bad thing is that it always uses a temporary table, even in the really simple cases, and there’s always an extra query block to execute, even if there’s no sorting and no duplicate filtering.

As of MySQL 5.7.3, UNION ALL doesn’t use temporary tables unless needed for sorting. The result of each part of the union will be sent directly back to the client, without waiting in a temporary table or passing through an extra query block. When 5.7.3 came out, Facebook (who filed the feature request for …

[Read more]
Multiple column index vs multiple indexes with MySQL 5.6

A question often comes when talking about indexing: should we use multiple column indexes or multiple indexes on single columns? Peter Zaitsev wrote about it back in 2008 and the conclusion then was that a multiple column index is most often the best solution. But with all the recent optimizer improvements, is there anything different with MySQL 5.6?

Setup

For this test, we will use these 2 tables (same structure as in Peter’s post):

CREATE TABLE t1000merge (
  id int not null auto_increment primary key,
  i int(11) NOT NULL,
  j int(11) NOT NULL,
  val char(10) NOT NULL,
  KEY i (i),
  KEY j (j)
) ENGINE=InnoDB;
CREATE TABLE t1000idx2 (
  id int not null auto_increment primary key,
  i int(11) NOT NULL,
  j int(11) NOT NULL,
  val char(10) NOT NULL,
  KEY ij (i,j)
) ENGINE=InnoDB;

Tables were …

[Read more]
Range access: now in an IN predicate near you.

Several users have reported that certain queries with IN predicates can’t use index scans even though all the columns in the query are indexed. What’s worse, if you reformulate your query without IN, the indexes are used. Let’s take some example query. Suppose we have a table with two indexed columns:CREATE TABLE t1 ( 
  col1 INTEGER,
  col2 INTEGER,
  …   KEY key1( col1, col2 ) ); Let’s take a look at some queries that could take advantage of the key1 index to read rows without accessing the table.

  1. SELECT col1, col2 FROM t1 WHERE col1 = 100;
  2. SELECT col1, col2 FROM t1 WHERE col1 > 100 AND col1 < 200;
  3. SELECT col1, col2 FROM t1 WHERE col1 > 100 AND col1 < 200 OR col1 > 300 AND col1 < 400;
  4. SELECT col1, col2 FROM t1 WHERE col1 = 100 AND col2 > 100 AND cold2 < 200;

These queries will use what MySQL …

[Read more]
Showing entries 51 to 60 of 110
« 10 Newer Entries | 10 Older Entries »