I mentioned in my Galera crash (un)safe post that it’s bad if a SST is triggered on a large dataset. On a Galera node an SST will simply wipe the contents of the MySQL data directory and copy a snapshot from another node in the cluster. For consistency this is a favorable thing to do, but from an HA perspective it isn’t: one node in the cluster is unavailable and another one is acting as donor. This means, if you have a three node Galera cluster, only two are available where one has “degraded” performance.
This could have quite high impact, for instance in the proxysql_galera_checker script only synced nodes are selected to receive traffic. By default the donor nodes will not receive any traffic and now the cluster only has one node left to serve traffic. Don’t worry: if there are no synced …[Read more]