How FLUSH TABLES WITH READ LOCK works with Innodb Tables

Many backup tools including Percona Xtrabackup, MyLVMBackup and others use FLUSH TABLES WITH READ LOCK to temporary make MySQL read only. In many cases the period for which server has to be made read only is very short, just few seconds, yet the impact of FLUSH TABLES WITH READ LOCK can be quite large because of the time it may take to complete this statement. Lets look at what the problem is.

As of MySQL 5.5 FLUSH TABLES WITH READ LOCK does not work as optimally as you could think it works. Even though with general lock compatibility guidelines Read Lock should not conflict with another Read Lock, it does for this statement, and as such it has to wait for any SELECT statement to complete in order to complete locking tables. This means if you have workload which includes some very long SELECT queries you can be potentially waiting for hours for this statement to complete. Here is example how it can look:

mysql> show processlist;
+-------+------+-----------+----------+---------+------+-------------------------+-----------------------------+-----------+---------------+-----------+
| Id    | User | Host      | db       | Command | Time | State                   | Info                        | Rows_sent | Rows_examined | Rows_read |
+-------+------+-----------+----------+---------+------+-------------------------+-----------------------------+-----------+---------------+-----------+
| 10219 | root | localhost | dumptest | Query   |  324 | Sending data            | select count(*) from A,B    |         0 |             0 |   2359297 |
| 10290 | root | localhost | NULL     | Query   |  317 | Waiting for table flush | flush tables with read lock |         0 |             0 |         1 |
| 10291 | root | localhost | dumptest | Query   |    0 | NULL                    | show processlist            |         0 |             0 |         2 |
+-------+------+-----------+----------+---------+------+-------------------------+-----------------------------+-----------+---------------+-----------+
3 rows in set (0.00 sec)

As you can see FLUSH TABLES WITH READ LOCK is waiting for that very nasty “full join” select to complete. What is worse as the statement started execution all writes will be blocked to the server, which in the end causes “field list” operation to be blocked too, which among other tools used by mysql client, so you might get a feel you can’t connect to the database at all (use mysql -A in this case, to prevent it from getting field list from all tables)

Reads though would not be blocked, ie there is no MYISAM like table lock priority problem with pending WRITE query blocks any READ queries to execute on the table.

mysql> show processlist;
+-------+------+-----------+----------+------------+------+------------------------------+------------------------------+-----------+---------------+-----------+
| Id    | User | Host      | db       | Command    | Time | State                        | Info                         | Rows_sent | Rows_examined | Rows_read |
+-------+------+-----------+----------+------------+------+------------------------------+------------------------------+-----------+---------------+-----------+
| 10219 | root | localhost | dumptest | Query      |  688 | Sending data                 | select count(*) from A,B     |         0 |             0 |   2359297 |
| 10290 | root | localhost | NULL     | Query      |  681 | Waiting for table flush      | flush tables with read lock  |         0 |             0 |         1 |
| 10291 | root | localhost | dumptest | Query      |   74 | Waiting for global read lock | insert into C values ("a",1) |         0 |             0 |         1 |
| 10304 | root | localhost | dumptest | Field List |   15 | Waiting for table flush      |                              |         0 |             0 |         4 |
| 10305 | root | localhost | dumptest | Query      |    0 | NULL                         | show processlist             |         0 |             0 |         1 |
+-------+------+-----------+----------+------------+------+------------------------------+------------------------------+-----------+---------------+-----------+
5 rows in set (0.00 sec)

As result this means single run away select can effectively cause downtime if you use backup solution which does FLUSH TABLES WITH READ LOCK which is very unfortunate.

As the real solution I’d love to see this problem fixed. There is no reason (other than code design) for read lock to wait for other read lock in order to be set.

As a workaround you can consider having scripts which would check for situation like above and either kill FLUSH TABLES WITH READ LOCK and fail backup or kill long running SELECT queries to let backup to proceed, but resolving server gridlock one way or another.

If you’re just using Innodb tables and you’re not actively changing users, stored procedures etc (which are stored in MyISAM tables anyway) you can consider using –no-lock option with Percona Xtrabackup. Many other tools have similar option. This option can be especially helpful when backing up data from the slave with –safe-slave-backup option, as preventing writes to the slave can be done by simply pausing replication.