Recently I worked on a production issue for one of our clients under support. Their architecture is a three-node Galera cluster with one asynchronous slave:
- Node1 – 18.104.22.168
- Node2 – 22.214.171.124
- Node3 – 126.96.36.199
- Replica – 188.8.131.52
The slave (replica) was configured with node 3 as its replication master. Unfortunately, node 3 crashed after being killed by the OOM killer. The server also had a small gcache size, so when I tried to start node 3, it could not catch up incrementally (IST) and went to a full SST. The data size here was around 2.6 TB, so completing the whole SST and joining the node back to the cluster generally takes approximately 12 hours.
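To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes decimal terabytes (1 TB = 10^12 bytes) and a constant transfer rate over the whole SST; the function name is made up for illustration only.

```python
def sst_throughput_mb_per_s(data_tb: float, hours: float) -> float:
    """Average throughput (MB/s) implied by moving data_tb terabytes in the given hours.

    Assumption: decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes)
    and a steady rate for the whole transfer.
    """
    total_bytes = data_tb * 1e12
    seconds = hours * 3600
    return total_bytes / seconds / 1e6


# Figures from the incident: about 2.6 TB in about 12 hours.
rate = sst_throughput_mb_per_s(2.6, 12)
print(f"Implied average SST rate: {rate:.1f} MB/s")
```

So the 12-hour window corresponds to an average of roughly 60 MB/s end to end, which is why falling back to SST on a dataset this size is so costly.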
As I mentioned earlier, the replication slave was under …