The other day I was working on an issue where one of the slaves was showing unexpected lag. Interestingly, with only the IO thread running, the slave was doing significantly more disk IO than could be explained by the rate at which the IO thread was fetching binary log events from the master.
I found this out by polling SHOW SLAVE STATUS and monitoring how the value of Read_Master_Log_Pos changed over time. I then compared that to the actual IO being done by the server, as reported by the pt-diskstats tool from the excellent Percona Toolkit. Note that, when doing this analysis, I had already stopped the slave SQL thread and made sure that there were no dirty InnoDB pages, otherwise my analysis would have …
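
To make the comparison concrete, here is a minimal Python sketch of how the fetch rate could be derived from repeated SHOW SLAVE STATUS samples. It assumes the samples (timestamp, Master_Log_File, Read_Master_Log_Pos) have already been collected by some other means. The names `Sample` and `fetch_rate` and the numbers in the usage example are hypothetical, made up for illustration, and are not values from the server discussed here.

```python
from typing import List, NamedTuple, Optional


class Sample(NamedTuple):
    """One reading taken from SHOW SLAVE STATUS (hypothetical container)."""
    ts: float        # time of the reading, in seconds
    log_file: str    # Master_Log_File
    log_pos: int     # Read_Master_Log_Pos, in bytes


def fetch_rate(samples: List[Sample]) -> Optional[float]:
    """Average bytes/sec fetched by the IO thread across the samples.

    Intervals where Master_Log_File changed are skipped, because the
    position restarts in the new file and the difference is meaningless.
    Returns None if no usable interval exists.
    """
    total_bytes = 0
    total_secs = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur.log_file != prev.log_file:
            continue  # master binary log rotated between the two readings
        dt = cur.ts - prev.ts
        if dt <= 0:
            continue  # ignore out-of-order or duplicate readings
        total_bytes += cur.log_pos - prev.log_pos
        total_secs += dt
    return total_bytes / total_secs if total_secs > 0 else None


# Usage with made-up readings taken 10 seconds apart.
samples = [
    Sample(0.0, "mysql-bin.000010", 1_000),
    Sample(10.0, "mysql-bin.000010", 11_000),
    Sample(20.0, "mysql-bin.000011", 500),     # rotation: interval skipped
    Sample(30.0, "mysql-bin.000011", 20_500),
]
rate = fetch_rate(samples)
print(f"IO thread fetch rate: {rate} bytes/sec")
```

The resulting figure can then be set against the write throughput that pt-diskstats reports for the same time window. A disk write rate far above the fetch rate is the kind of discrepancy described above.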