The sort buffer, which is controlled by the
sort_buffer_size
setting, is probably one of the
most-discussed MySQL settings. Interestingly, in many cases it
has a lot less potential to help performance than to hurt it, so
configuring it is often about avoiding trouble rather than
gaining a lot of performance.
The variable controls the size of a buffer that’s created whenever MySQL has to sort rows. It is per-query, meaning each query gets its own buffer, and it’s allocated to its full size, not as-much-as-needed. This makes large settings potentially dangerous.
The worst abuse of this variable we’ve seen came from a server that was tuned with a script. The script relied on a naive formula that looked at a simplistic ratio of some server variables. Due to the server’s workload the script was never satisfied and continually suggested increasing this variable, which eventually was set to 1GB. The effect was that every time the server needed to sort a few rows, 1GB of memory was allocated. As you might expect, this was not only slow, but crashed the server pretty frequently due to the out-of-memory killer being invoked.
What if the variable is too small? This matters most when there’s a large number of rows to sort (the type of situation you might see with a large analytical query). In this case, the server will generate rows and write them to the buffer in memory, fill the buffer, then write the buffer to a sort file and repeat. The server then sorts these temporary files and merges them to sort the whole result-set. If this happens many times, it can be slow.
This variable can be set per-connection temporarily if such a large query is anticipated, and that’s probably better than setting it large globally. Settings that help large queries can seriously hurt small, fast queries.
You can find out whether a query is potentially a candidate for a
larger sort buffer by looking at the number of rows sorted or the
sort merge passes required. VividCortex can use regression to
estimate this on a per-query basis (MySQL makes it available as a
global status counter, not per-query, unless you are using the
Performance Schema in MySQL 5.6 and later). If you identify
queries that might benefit from a larger sort buffer, you can
then try some experiments by hand. You can look at the
Sort_merge_passes
local (not global) status counter
before and after running the query, and tune and time until you
determine whether the setting helps.
If you do configure a large per-connection setting, be sure that you either close the connection or reset the variable before, for example, returning the connection to a connection pool where the setting might impact other queries.
To read more from our ebook “Configuing MySQL for Performance,” you can download a free copy here.