One of the things that has always bothered me about replication is that the binary logs are written to disk and then read from disk.
There are two threads which are, for the most part, unaware of each other.
One thread (the I/O thread) reads the remote binary logs and writes them to disk as relay logs, and the other (the SQL thread) reads them back from disk and applies them.
While the Linux page cache CAN work to buffer these logs, the first write will still cause additional disk load.
One strategy, which could seriously boost performance in some situations, would be to pre-read, say, 10-50MB of data and just keep it in memory.
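If a slave is catching up, it could receive GIGABYTES of binary log data from the master and write all of it to disk. By the time the slave gets around to reading that data back, much of it may already have been pushed out of the page cache, so these reads would NOT come from cache.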
Simply using a small buffer could solve this problem.
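
To make the idea concrete, here is a minimal sketch in Python of a byte-bounded in-memory buffer sitting between a reader thread and an applier thread. This is NOT how MySQL implements replication; the names ByteBoundedBuffer, run_demo, io_thread and sql_thread are hypothetical and exist only for illustration, and the events are fake byte strings rather than real binary log events. The point is only to show that the reader blocks once the buffer is full, so memory use stays capped while the applier never has to touch disk.

```python
import threading
from collections import deque


class ByteBoundedBuffer:
    """FIFO of byte chunks; producers block once max_bytes are buffered."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.peak = 0  # highest number of bytes ever held, for inspection
        self._chunks = deque()
        self._size = 0
        self._closed = False
        self._cond = threading.Condition()

    def put(self, chunk):
        with self._cond:
            # Wait for room. A chunk is always admitted into an empty
            # buffer so that one oversized event cannot deadlock.
            while self._size > 0 and self._size + len(chunk) > self.max_bytes:
                self._cond.wait()
            self._chunks.append(chunk)
            self._size += len(chunk)
            self.peak = max(self.peak, self._size)
            self._cond.notify_all()

    def close(self):
        """Signal that no more chunks will be added."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def get(self):
        """Return the next chunk, or None once closed and drained."""
        with self._cond:
            while not self._chunks and not self._closed:
                self._cond.wait()
            if not self._chunks:
                return None
            chunk = self._chunks.popleft()
            self._size -= len(chunk)
            self._cond.notify_all()
            return chunk


def run_demo(n_events=1000, event_size=1024, max_bytes=16 * 1024):
    """Push fake events through the buffer; return (applied ids, peak bytes)."""
    buf = ByteBoundedBuffer(max_bytes)
    applied = []

    def io_thread():
        # Stand-in for the thread that reads events from the master.
        for i in range(n_events):
            buf.put(i.to_bytes(4, "big") + b"x" * (event_size - 4))
        buf.close()

    def sql_thread():
        # Stand-in for the thread that applies events on the slave.
        while True:
            chunk = buf.get()
            if chunk is None:
                break
            applied.append(int.from_bytes(chunk[:4], "big"))

    threads = [threading.Thread(target=io_thread),
               threading.Thread(target=sql_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return applied, buf.peak


if __name__ == "__main__":
    applied, peak = run_demo()
    print(len(applied), "events applied, peak buffer size", peak, "bytes")
```

A real implementation would still need to persist the events somewhere for crash safety; this sketch only covers the hand-off between the two threads.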
One HACK would be to use a ram drive or tmpfs for logs. I assume that the log thread will block if the disk fills up… if it does so …