At the end of my talk “Troubleshooting MySQL Performance” at the LinuxPiter conference, a user asked me a question: “What does the EXPLAIN ‘filtered’ field mean, and how do I use it?” I explained that this is the percentage of rows that were actually needed, against the equal or bigger number of resolved rows. While the user was happy with the answer, I’d like to better illustrate this. And I can do it with help of EXPLAIN FORMAT=JSON and its
rows_examined_per_scan, rows_produced_per_join
statistics.
Let’s take a simple query that searches information about the Russian Federation in the table
Country
of the standard world database:
mysql> select * from Country where Name='Russian Federation'G *************************** 1. row *************************** Code: RUS Name: Russian Federation Continent: Europe Region: Eastern Europe SurfaceArea: 17075400.00 IndepYear: 1991 Population: 146934000 LifeExpectancy: 67.2 GNP: 276608.00 GNPOld: 442989.00 LocalName: Rossija GovernmentForm: Federal Republic HeadOfState: Vladimir Putin Capital: 3580 Code2: RU 1 row in set (0.00 sec)
It returns single row – but how many rows were actually used to resolve the query?
EXPLAIN
will show us:
mysql> 56-explain select * from Country where Name='Russian Federation'; +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+ | 1 | SIMPLE | Country | NULL | ALL | NULL | NULL | NULL | NULL | 239 | 10.00 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+ 1 row in set, 1 warning (0.00 sec) Note (Code 1003): /* select#1 */ select `world`.`Country`.`Code` AS `Code`,`world`.`Country`.`Name` AS `Name`,`world`.`Country`.`Continent` AS `Continent`,`world`.`Country`.`Region` AS `Region`,`world`.`Country`.`SurfaceArea` AS `SurfaceArea`,`world`.`Country`.`IndepYear` AS `IndepYear`,`world`.`Country`.`Population` AS `Population`,`world`.`Country`.`LifeExpectancy` AS `LifeExpectancy`,`world`.`Country`.`GNP` AS `GNP`,`world`.`Country`.`GNPOld` AS `GNPOld`,`world`.`Country`.`LocalName` AS `LocalName`,`world`.`Country`.`GovernmentForm` AS `GovernmentForm`,`world`.`Country`.`HeadOfState` AS `HeadOfState`,`world`.`Country`.`Capital` AS `Capital`,`world`.`Country`.`Code2` AS `Code2` from `world`.`Country` where (`world`.`Country`.`Name` = 'Russian Federation')
You see that 239 rows were examined, and 10% of them filtered. But what exactly was done? An explanation exists in the
EXPLAIN FORMAT=JSON
output:
mysql> explain format=json select * from Country where Name='Russian Federation'G *************************** 1. row *************************** EXPLAIN: { "query_block": { "select_id": 1, "cost_info": { "query_cost": "53.80" }, "table": { "table_name": "Country", "access_type": "ALL", "rows_examined_per_scan": 239, "rows_produced_per_join": 23, "filtered": "10.00", "cost_info": { "read_cost": "49.02", "eval_cost": "4.78", "prefix_cost": "53.80", "data_read_per_join": "6K" }, "used_columns": [ "Code", "Name", "Continent", "Region", "SurfaceArea", "IndepYear", "Population", "LifeExpectancy", "GNP", "GNPOld", "LocalName", "GovernmentForm", "HeadOfState", "Capital", "Code2" ], "attached_condition": "(`world`.`Country`.`Name` = 'Russian Federation')" } } } 1 row in set, 1 warning (0.00 sec) Note (Code 1003): /* select#1 */ select `world`.`Country`.`Code` AS `Code`,`world`.`Country`.`Name` AS `Name`,`world`.`Country`.`Continent` AS `Continent`,`world`.`Country`.`Region` AS `Region`,`world`.`Country`.`SurfaceArea` AS `SurfaceArea`,`world`.`Country`.`IndepYear` AS `IndepYear`,`world`.`Country`.`Population` AS `Population`,`world`.`Country`.`LifeExpectancy` AS `LifeExpectancy`,`world`.`Country`.`GNP` AS `GNP`,`world`.`Country`.`GNPOld` AS `GNPOld`,`world`.`Country`.`LocalName` AS `LocalName`,`world`.`Country`.`GovernmentForm` AS `GovernmentForm`,`world`.`Country`.`HeadOfState` AS `HeadOfState`,`world`.`Country`.`Capital` AS `Capital`,`world`.`Country`.`Code2` AS `Code2` from `world`.`Country` where (`world`.`Country`.`Name` = 'Russian Federation')
We are interested in this part:
"rows_examined_per_scan": 239, "rows_produced_per_join": 23, "filtered": "10.00",
It clearly shows that 239 rows were examined, but only 23 rows were used to produce the result. To make this query more effective we need to add an index on the
Name
field:
mysql> alter table Country add index(Name); Query OK, 0 rows affected (0.40 sec) Records: 0 Duplicates: 0 Warnings: 0
Now the
EXPLAIN
plan is much better: we only examine 1 required row, and the value of filtered is 100%:
mysql> pager egrep 'rows_examined_per_scan|rows_produced_per_join|filtered'; PAGER set to 'egrep 'rows_examined_per_scan|rows_produced_per_join|filtered'' mysql> explain format=json select * from Country where Name='Russian Federation'G "rows_examined_per_scan": 1, "rows_produced_per_join": 1, "filtered": "100.00", 1 row in set, 1 warning (0.00 sec)
The post rows_examined_per_scan, rows_produced_per_join: EXPLAIN FORMAT=JSON answers on question “What number of filtered rows mean?” appeared first on MySQL Performance Blog.