rows_examined_per_scan, rows_produced_per_join: EXPLAIN FORMAT=JSON answers on question “What number of filtered rows mean?”

At the end of my talk “Troubleshooting MySQL Performance” at the LinuxPiter conference, a user asked me a question: “What does the EXPLAIN ‘filtered’ field mean, and how do I use it?” I explained that this is the percentage of rows that were actually needed, against the equal or bigger number of resolved rows. While the user was happy with the answer, I’d like to better illustrate this. And I can do it with help of EXPLAIN FORMAT=JSON and its

rows_examined_per_scan, rows_produced_per_join

  statistics.

Let’s take a simple query that searches information about the Russian Federation in the table

Country

  of the standard world database:

mysql> select * from Country where Name='Russian Federation'G
*************************** 1. row ***************************
          Code: RUS
          Name: Russian Federation
     Continent: Europe
        Region: Eastern Europe
   SurfaceArea: 17075400.00
     IndepYear: 1991
    Population: 146934000
LifeExpectancy: 67.2
           GNP: 276608.00
        GNPOld: 442989.00
     LocalName: Rossija
GovernmentForm: Federal Republic
   HeadOfState: Vladimir Putin
       Capital: 3580
         Code2: RU
1 row in set (0.00 sec)

It returns single row – but how many rows were actually used to resolve the query?

EXPLAIN

  will show us:

mysql> 56-explain select * from Country where Name='Russian Federation';
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | Country | NULL       | ALL  | NULL          | NULL | NULL    | NULL |  239 |    10.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `world`.`Country`.`Code` AS `Code`,`world`.`Country`.`Name` AS `Name`,`world`.`Country`.`Continent` AS `Continent`,`world`.`Country`.`Region` AS `Region`,`world`.`Country`.`SurfaceArea` AS `SurfaceArea`,`world`.`Country`.`IndepYear` AS `IndepYear`,`world`.`Country`.`Population` AS `Population`,`world`.`Country`.`LifeExpectancy` AS `LifeExpectancy`,`world`.`Country`.`GNP` AS `GNP`,`world`.`Country`.`GNPOld` AS `GNPOld`,`world`.`Country`.`LocalName` AS `LocalName`,`world`.`Country`.`GovernmentForm` AS `GovernmentForm`,`world`.`Country`.`HeadOfState` AS `HeadOfState`,`world`.`Country`.`Capital` AS `Capital`,`world`.`Country`.`Code2` AS `Code2` from `world`.`Country` where (`world`.`Country`.`Name` = 'Russian Federation')

You see that 239 rows were examined, and 10% of them filtered. But what exactly was done? An explanation exists in the 

EXPLAIN FORMAT=JSON

  output:

mysql> explain format=json select * from Country where Name='Russian Federation'G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "53.80"
    },
    "table": {
      "table_name": "Country",
      "access_type": "ALL",
      "rows_examined_per_scan": 239,
      "rows_produced_per_join": 23,
      "filtered": "10.00",
      "cost_info": {
        "read_cost": "49.02",
        "eval_cost": "4.78",
        "prefix_cost": "53.80",
        "data_read_per_join": "6K"
      },
      "used_columns": [
        "Code",
        "Name",
        "Continent",
        "Region",
        "SurfaceArea",
        "IndepYear",
        "Population",
        "LifeExpectancy",
        "GNP",
        "GNPOld",
        "LocalName",
        "GovernmentForm",
        "HeadOfState",
        "Capital",
        "Code2"
      ],
      "attached_condition": "(`world`.`Country`.`Name` = 'Russian Federation')"
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `world`.`Country`.`Code` AS `Code`,`world`.`Country`.`Name` AS `Name`,`world`.`Country`.`Continent` AS `Continent`,`world`.`Country`.`Region` AS `Region`,`world`.`Country`.`SurfaceArea` AS `SurfaceArea`,`world`.`Country`.`IndepYear` AS `IndepYear`,`world`.`Country`.`Population` AS `Population`,`world`.`Country`.`LifeExpectancy` AS `LifeExpectancy`,`world`.`Country`.`GNP` AS `GNP`,`world`.`Country`.`GNPOld` AS `GNPOld`,`world`.`Country`.`LocalName` AS `LocalName`,`world`.`Country`.`GovernmentForm` AS `GovernmentForm`,`world`.`Country`.`HeadOfState` AS `HeadOfState`,`world`.`Country`.`Capital` AS `Capital`,`world`.`Country`.`Code2` AS `Code2` from `world`.`Country` where (`world`.`Country`.`Name` = 'Russian Federation')

We are interested in this part:

"rows_examined_per_scan": 239,
      "rows_produced_per_join": 23,
      "filtered": "10.00",

It clearly shows that 239 rows were examined, but only 23 rows were used to produce the result. To make this query more effective we need to add an index on the 

Name

 field:

mysql> alter table Country add index(Name);
Query OK, 0 rows affected (0.40 sec)
Records: 0 Duplicates: 0 Warnings: 0

Now the 

EXPLAIN

  plan is much better: we only examine 1 required row, and the value of filtered is 100%:

mysql> pager egrep 'rows_examined_per_scan|rows_produced_per_join|filtered';
PAGER set to 'egrep 'rows_examined_per_scan|rows_produced_per_join|filtered''
mysql> explain format=json select * from Country where Name='Russian Federation'G
      "rows_examined_per_scan": 1,
      "rows_produced_per_join": 1,
      "filtered": "100.00",
1 row in set, 1 warning (0.00 sec)

 

The post rows_examined_per_scan, rows_produced_per_join: EXPLAIN FORMAT=JSON answers on question “What number of filtered rows mean?” appeared first on MySQL Performance Blog.