The more I work on orchestrator, the more user input I receive, and the more production experience I gain, the more insight I get into MySQL master recoveries. I'd like to share the complexities of correctly running general-purpose master failovers, from picking the right candidates to finalizing the promotion.
The TL;DR is: we're often unaware of just how things can turn out at the time of failover, and of the impact of every single decision we make. Different environments have different requirements, and different users wish to have different policies. Understanding the scenarios can help you make the right choice.
The scenarios and considerations below are ones I picked up while browsing through the orchestrator code and through Issues and questions. There are more. There are always more.
I discuss "normal replication" scenarios below; some of these will …