There has been some interesting discussion online recently about how to handle database failover (meaning MySQL here, though it really applies to other systems too). The discussion that I’ve followed so far, in order, is:
- GitHub’s report on their automated failover and downtime issues
- Baron’s follow-up to that, “Is automated failover the root of all evil?”, which generated a bit of discussion in the comments
- Peter’s follow-up to both, “The Math of Automated Failover”, which heads in the direction I was going when I realized I might want to toss my 2 cents into the mix

