We had a one of our slave servers frequently stop replicating with the “Innodb Lock Wait Timeout” error. The slave IO thread would continue to fetch the binlogs while the slave SQL thread kept stopping with the above mentioned error. The teams initial inclination was to change the innodb lock wait timeout variable from 50 secs to a higher value. It was a read-only slave. Our expectation was there would be no competing writes. Then we started listing what are the next steps possible and what could be wrong.
- There could be a user with “super” privilege in the system that was running updates directly on the slave
- A backup script that could be locking the tables out for backup
- Increase the “innodb lock wait timeout variable“
- Enable the innodb lock monitor
While we were working on the system, we noticed that there were few select queries that …[Read more]