Introduction –
Recently I worked on a production issue for one of our clients under support. Their architecture is a three-node Galera cluster with one asynchronous slave (replica).
- Node1 – 172.10.2.11
- Node2 – 172.10.2.12
- Node3 – 172.10.2.13
- Replica – 172.10.2.14
Architecture –
The slave (replica) was configured with node 3 as its replication master. Unfortunately, node 3 was killed by the OOM killer. The server also had a small gcache size, so when I tried to start node 3 again, it fell back to a full SST. The data size was around 2.6 TB, and completing the whole SST and rejoining the node to the cluster generally takes about 12 hours.
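To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It works out the average transfer rate implied by moving 2.6 TB in 12 hours, and estimates how large the gcache would need to be to keep every write-set from an outage, so that the node could rejoin with IST instead of SST. The helper names, the decimal-TB assumption, and the 1 MB/s write rate are illustrative assumptions, not measurements from this cluster.

```python
# Rough estimates only; the units and the write rate below are assumptions.
TB = 1000 ** 4  # decimal terabyte (the post does not say TB vs TiB)
MB = 1000 ** 2


def implied_throughput_mb_s(data_bytes: float, hours: float) -> float:
    """Average rate (MB/s) implied by transferring data_bytes in `hours`."""
    return data_bytes / (hours * 3600) / MB


def gcache_bytes_needed(write_rate_mb_s: float, outage_hours: float) -> float:
    """Bytes of write-sets generated during an outage (simplified model).

    If the donor's gcache is smaller than this, the write-sets the joiner
    needs may already be purged, and the node falls back to SST.
    """
    return write_rate_mb_s * MB * outage_hours * 3600


if __name__ == "__main__":
    rate = implied_throughput_mb_s(2.6 * TB, 12)
    print(f"Implied SST throughput: {rate:.1f} MB/s")  # about 60.2 MB/s

    # Hypothetical workload: 1 MB/s of write-sets, node down for 2 hours.
    needed = gcache_bytes_needed(1.0, 2)
    print(f"gcache needed for IST: {needed / 1000 ** 3:.1f} GB")
```

In Galera the cache size is controlled by `gcache.size` inside `wsrep_provider_options`; a value sized for the longest outage you expect to recover from makes IST more likely, at the cost of disk space.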
As I mentioned earlier, the replication slave was under …