In my previous post about InnoDB Stalls on Empty Free List, I used a test environment that might look a little artificial : a table larger than the InnoDB Buffer Pool but fitting in the Linux Page Cache. This configuration allows serving very quickly what MySQL thinks are IOs because these are hit in the filesystem cache. In this post, I explain why this environment is not totally artificial
If, in Percona Server, you are observing tail latencies on queries that should be fast, this might be a side effect of Percona's improved InnoDB Empty Free List Algorithm. When using this algorithm (the default in 5.6 and 5.7 and optional configuration in 8.0), a query needing a free page while none are available waits until the LRU Manager Thread refills the free list. Because this thread is
We quite often say, that benchmark performance is usually different from real world performance – so performance engineering usually has to cover both – benchmarks allow to understand sustained performance bottlenecks, and real world analysis usually concentrates on something what would be considered ‘exceptional’ and not important in benchmarks – stalls of various kind. They are extremely important, as the state when our performance is lowest is the state of performance we provide to our platform users.
On a machine that is doing 5000qps, stalling for 100ms means that 500 queries were not served as fast as they could, or even hit application timeouts or exceptional MySQL conditions (like 1023 transaction limit). Of course, stalling for a second means 5000 queries were not served in time…
We have multiple methods to approach this – one is our …[Read more]