Even when the output of EXPLAIN doesn’t show “using temporary”, a temporary file may still be used in certain cases.
That’s not to say the query needs the temporary file to actually resolve the query (like what you’d see from the need for a derived table). But rather, the temporary file I’m speaking of is due to binary logging.
In particular, you can see this easily if using InnoDB, (most
commonly) row-based binary logging, and you issue a
large transaction, say a large UPDATE
(large meaning
something larger than the size of binlog_cache_size). In this case, you’ll
notice a temporary file being created in the MySQL tmpdir (if you
were using lsof
or something similar to monitor the
tmpdir).
If you have row-based binary logging enabled, this file would
have a name like ML*
. If you are not using row-based
binary logging, it will have a file name like MY*
(unlike the #sql*
tmp files most of us are
accustomed to).
Say you run a large UPDATE
which updates millions of
rows (note the UPDATE
is a single transaction in
itself). If you have row-based binary log format enabled, then
each row is written to the binary log. However, before being
written to the binary log, the changes are first written to the
binlog_cache (i.e., it will not write the tranaction to the
binary log until the transaction commits).
With a small value of binlog_cache_size
(which the
default is only 32K), it is easy to overrun this cache, hence you
end up with a temporary file being created/used to store this
information until the transaction completes.
So, to avoid this, you have a couple of options:
- o Increase binlog_cache_size (max is 4G)
- o Change overall binary log format to MIXED mode
- o Dynamically change session to MIXED/STATEMENT mode before running the query
(In either of the latter 2 options, the query must be deterministic.)
For increasing binlog_cache_size
, there are a couple
of things you can look at.
In the general case, you can monitor this cache from the
SHOW GLOBAL STATUS
:
mysql> show global status like 'binlog%'; +-----------------------+----------+ | Variable_name | Value | +-----------------------+----------+ | Binlog_cache_disk_use | 652065 | | Binlog_cache_use | 38271899 | +-----------------------+----------+
In this case, you could compute the efficiency:
((1 - (652065/38271899))*100) = 98.3%.
This is not a terrible ratio, but there is definitely room for improvement. So again, when tuning this in the general case, increase in smaller intervals, and monitor the changes. Then iteratively re-apply this process until you’re comfortable with the value/ratio.
Now, in the more specific case, suppose you were monitoring the tmpdir with lsof, then you might see something like this:
chris@chris-linux:~/mysql-5.5.8/bin$ lsof | grep tmpdir mysqld 17499 chris 5u REG 8,5 0 246876 /mysql-5.5.8/tmpdir/ibr26n5u (deleted) mysqld 17499 chris 6u REG 8,5 0 246877 /mysql-5.5.8/tmpdir/ibqsbazO (deleted) mysqld 17499 chris 7u REG 8,5 0 246943 /mysql-5.5.8/tmpdir/ibYCyW27 (deleted) mysqld 17499 chris 8u REG 8,5 0 246947 /mysql-5.5.8/tmpdir/ibnrd9xr (deleted) mysqld 17499 chris 12u REG 8,5 0 246948 /mysql-5.5.8/tmpdir/ibT0q36K (deleted) mysqld 17499 chris 36u REG 8,5 320057735 246960 /mysql-5.5.8/tmpdir/MLcPabAK (deleted)
The tmp files named ib*
are related to InnoDB, but
that is another discussion.
The file in question is MLcPabAK
, which grew to over
300M while the UPDATE
was executing.
Thus, if I was only trying to eliminate the need for the tmp file
in this instance, I know my binlog_cache_size
would
need to exceed 300M.
Morals of the story:
If running with InnoDB and row-based replication, you should
check the value of Binlog_cache_disk_use (and compare to binlog_cache_use) and increase
binlog_cache_size
if necessary. You may find it a
good idea to tune this anyway.
Further, if running large, deterministic UPDATES, you might want
to consider changing the binlog_format
for the
duration of said queries, to simply avoid all of the extra I/O.
Aside Comment:
Upcoming change to MySQL 5.5.9:
Begining with MySQL 5.5.9, binlog_cache_size
sets
the size for the transaction cache only, and the size of the
statement cache is governed by the binlog_stmt_cache_size system variable.
So be sure to check out the above if you are using MySQL 5.5.9 or newer. Also, you might be interested in max_binlog_cache_size and max_binlog_stmt_cache_size.
Click here to give this article a “Thumbs Up” on Planet MySQL !