This is a follow-up to Jay Janssen’s October post, “Using keepalived for HA on top of Percona XtraDB Cluster.” We recently got a request from a customer with a 3-node cluster who wanted 2 VIPs (Virtual IP addresses), one for reads and one for writes. They wanted to keep it simple and low-latency, and without requiring an external node resource the way HAProxy would.
keepalived is a simple load balancer with HA capabilities, which means it can proxy TCP services behind it and, at the same time, keep itself highly available using VRRP as its failover mechanism. This post is about taking advantage of the VRRP capabilities built into keepalived to intelligently manage your PXC VIPs.
Yves Trudeau wrote a very interesting and somewhat similar solution using ClusterIP and Pacemaker to load balance VIPs, but the two approaches have different use cases. Both avoid the latency of an external proxy or load balancer, but unlike ClusterIP, a connection to a keepalived VIP goes to a single node, which means a little less work for each node deciding whether it should respond to the request. ClusterIP is a good fit if you want to spread writes across all nodes in a calculated distribution, while with our keepalived option each VIP is assigned to at most one node at a time. Depending on your workload, each approach will have advantages and disadvantages.
The OS I used was CentOS 6.4, where keepalived 1.2.7 is available in the yum repositories. However, it’s difficult to troubleshoot failover behavior driven by VRRP_Instance weights without seeing those weights from keepalived directly, so I used a custom build with a patch that adds a --vrrp-status option, which lets me monitor something like this:
[root@pxc01 keepalived]# keepalived --vrrp-status
VRRP Instance            : writer_vip
  Interface              : eth5
  Virtual Router ID      : 60
  State                  : BACKUP
  Virtual IP address     : 192.168.56.83
  Advertisement interval : 1 sec
  Preemption             : Enabled, delay 0 secs
  Priority               : 101
  Effective Priority     : 101
  Authentication         : NONE
  Master router          : 192.168.56.44 priority 151
  Master down interval   : 3.6
VRRP Instance            : reader_vip
  Interface              : eth5
  Virtual Router ID      : 61
  State                  : MASTER
  Virtual IP address     : 192.168.56.84
  Advertisement interval : 1 sec
  Preemption             : Enabled, delay 0 secs
  Priority               : 101
  Effective Priority     : 181
  Authentication         : NONE
  Master router          : 192.168.56.42 (local)
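Since the patched binary can dump VRRP state on demand, a small wrapper can condense that output into one line per node. The helper below is a hypothetical sketch (it is not part of the patch, and it assumes the patched keepalived is in your PATH), but it is roughly how you could reproduce the per-node weight summaries shown later in this post:

#!/bin/bash
# Hypothetical helper: print hostname, timestamp and the effective priority of
# every VRRP instance reported by the patched "keepalived --vrrp-status" output.
OUT="$(hostname -s) $(date +%Y-%m-%d_%H_%M_%S)"
while read -r line; do
   case "$line" in
      'VRRP Instance'*)      OUT="$OUT ${line##*: }" ;;
      'Effective Priority'*) OUT="$OUT ${line##*: }" ;;
   esac
done < <(keepalived --vrrp-status)
echo "$OUT"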
So first, let’s compile keepalived from source; the GitHub branch below is where the status patch is available.
cd ~
git clone https://github.com/jonasj76/keepalived.git
cd keepalived
# Check out the commit that carries the --vrrp-status patch
git checkout 5c5b2cc51760967c92b968d6e886ab6ecc2ee86d
./configure
make && make install
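To confirm you are running the freshly built binary rather than a distribution package, you can do something like the following; depending on your PATH and configure prefix you may need the full path to the new binary, and the grep assumes the patch registers its option in the help output:

# Check which keepalived is picked up and that it is the custom build
which keepalived
keepalived --version
keepalived --help | grep -i vrrp-status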
Install the custom tracker script below. Because the keepalived build above installs into /usr/local/bin, I put this script there as well, as /usr/local/bin/pxc-track, the path the keepalived.conf below references. One might note that this script is largely redundant, and that’s true, but keepalived does not validate its configuration, especially track_scripts, so I prefer to keep the checks in a separate bash script that I can run and debug by hand when something misbehaves (as shown right after the script). Of course, once everything is working well, you can always merge this logic into the keepalived.conf file.
#!/bin/bash

# Modify these addresses to match your reader and writer VIPs
WRITER_VIP=192.168.56.83
READER_VIP=192.168.56.84

# Make sure your clustercheck script also works
PXC_CHECK='/usr/bin/clustercheck clustercheck password 0'

SCRIPT=$1
WEIGHT=101

case $SCRIPT in
   'bad_pxc')
      $PXC_CHECK || exit 1
      ;;
   'nopreempt_writer')
      [[ "$(hostname|cut -d'.' -f1)" != 'pxc01' && $(ip ad sh|grep $WRITER_VIP) && $(ip ad sh|grep $READER_VIP|grep -c inet) -eq 0 ]] || exit 1
      ;;
   'nopreempt_reader')
      [[ "$(hostname|cut -d'.' -f1)" != 'pxc02' && $(ip ad sh|grep $READER_VIP) && $(ip ad sh|grep $WRITER_VIP|grep -c inet) -eq 0 ]] || exit 1
      ;;
   'repel_writer')
      [ $(ip ad sh|grep $WRITER_VIP|grep -c inet) -eq 0 ] || exit 1
      ;;
esac

exit 0
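Since keepalived only ever sees an exit code from a track_script, it helps to exercise the checks by hand on each node before wiring them into the configuration. A quick test loop might look like this (assuming the script was saved as /usr/local/bin/pxc-track and made executable):

# Run each check manually and print the exit status keepalived would act on
for check in bad_pxc repel_writer nopreempt_writer nopreempt_reader; do
   /usr/local/bin/pxc-track "$check"
   echo "$check -> exit $?"
done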
And below is my /etc/keepalived.conf:
vrrp_script nopreempt_writer_vip {
   script "/usr/local/bin/pxc-track nopreempt_writer"
   interval 2
}
vrrp_script nopreempt_reader_vip {
   script "/usr/local/bin/pxc-track nopreempt_reader"
   interval 2
}
vrrp_script repel_writer_vip {
   script "/usr/local/bin/pxc-track repel_writer"
   interval 2
}
vrrp_script bad_pxc {
   script "/usr/local/bin/pxc-track bad_pxc"
   interval 2
}

vrrp_instance writer_vip {
   interface eth5
   state BACKUP
   virtual_router_id 60
   priority 101

   virtual_ipaddress {
      192.168.56.83
   }

   track_script {
      nopreempt_writer_vip weight 50
      bad_pxc weight -100
   }

   track_interface {
      eth5
   }

   notify_master "/bin/echo 'writer now master' > /tmp/keepalived-w.state"
   notify_backup "/bin/echo 'writer now backup' > /tmp/keepalived-w.state"
   notify_fault  "/bin/echo 'writer now fault'  > /tmp/keepalived-w.state"
}

vrrp_instance reader_vip {
   interface eth5
   state BACKUP
   virtual_router_id 61
   priority 101

   virtual_ipaddress {
      192.168.56.84
   }

   track_script {
      repel_writer_vip weight 30
      nopreempt_reader_vip weight 50
      bad_pxc weight -100
   }

   track_interface {
      eth5
   }

   ! This does not work properly if we stop the MySQL process
   ! VIP seems to stick on the node so we have separate nopreempt_* track_scripts
   !nopreempt

   notify_master "/bin/echo 'reader now master' > /tmp/keepalived-r.state"
   notify_backup "/bin/echo 'reader now backup' > /tmp/keepalived-r.state"
   notify_fault  "/bin/echo 'reader now fault'  > /tmp/keepalived-r.state"
}
There are a number of things you can change here, such as removing or modifying the notify_* clauses to fit your needs, or having keepalived send SMTP notifications during VIP failovers (a sketch follows below). I also prefer the initial state of the VRRP_Instances to be BACKUP instead of MASTER, and to let the voting at runtime dictate where the VIPs should go.
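For e-mail alerts, the global_defs directives below are standard keepalived options, but the addresses and SMTP server are placeholders you would adapt to your environment; you would also add smtp_alert inside each vrrp_instance you want alerts for:

global_defs {
   ! Placeholder recipients and relay; adjust to your environment
   notification_email {
      dba-team@example.com
   }
   notification_email_from keepalived@pxc01.example.com
   smtp_server 192.168.56.1
   smtp_connect_timeout 30
}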
The configuration ensures that the reader and writer VIPs will not share a single node as long as more than one node is available in the cluster. Even though the writer VIP prefers pxc01 in my example, this does not really matter much and only makes a difference when the reader VIP is not in the picture; thanks to the nopreempt_* track_scripts there is no automatic failback.
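To make the weight math concrete, here is how the reader VIP’s effective priority works out with the weights above (keepalived adds a track_script’s positive weight when the check succeeds, and applies its negative weight when the check fails); these values match the reader_vip numbers in the outputs that follow:

# reader_vip, base priority 101
# node not holding the writer VIP, MySQL healthy:   101 + 30 (repel_writer)                  = 131
# node already holding the reader VIP (no preempt): 101 + 30 (repel_writer) + 50 (nopreempt) = 181
# node failing clustercheck:                        101 + 30 (repel_writer) - 100 (bad_pxc)  = 31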
Now, to see it in action: after starting the cluster and keepalived on pxc01, pxc02 and pxc03, in that order, I have these statuses and weights:
[revin@forge ~]$ for h in pxc01 pxc02 pxc03; do echo "$h $(ssh root@$h 'cat /tmp/keepalived-w.state /tmp/keepalived-r.state'|xargs)"; done
pxc01 writer now master reader now backup
pxc02 writer now backup reader now master
pxc03 writer now backup reader now backup

pxc01 2014-01-15_20_58_23 writer_vip 161 reader_vip 101
pxc02 2014-01-15_20_58_28 writer_vip 101 reader_vip 131
pxc03 2014-01-15_20_58_36 writer_vip 131 reader_vip 131
The writer VIP is on pxc01 and the reader VIP on pxc02; even though the reader VIP scores on pxc02 and pxc03 are tied, the VIP stays on pxc02 because of our nopreempt_* script. Let’s see what happens if I stop MySQL on pxc02:
[revin@forge ~]$ for h in pxc01 pxc02 pxc03; do echo "$h $(ssh root@$h 'cat /tmp/keepalived-w.state /tmp/keepalived-r.state'|xargs)"; done
pxc01 writer now master reader now backup
pxc02 writer now backup reader now backup
pxc03 writer now backup reader now master

pxc01 2014-01-15_21_01_17 writer_vip 161 reader_vip 101
pxc02 2014-01-15_21_01_24 writer_vip 31 reader_vip 31
pxc03 2014-01-15_21_01_36 writer_vip 101 reader_vip 181
The reader VIP moved to pxc03 and the weights changed: the reader score on pxc02 dropped by 100 while on pxc03 it gained 50; again, we set this extra weight so the new owner will not be preempted. Now let’s stop MySQL on pxc03:
[revin@forge ~]$ for h in pxc01 pxc02 pxc03; do echo "$h $(ssh root@$h 'cat /tmp/keepalived-w.state /tmp/keepalived-r.state'|xargs)"; done
pxc01 writer now master reader now master
pxc02 writer now backup reader now backup
pxc03 writer now backup reader now backup

pxc01 2014-01-15_21_04_43 writer_vip 131 reader_vip 101
pxc02 2014-01-15_21_04_49 writer_vip 31 reader_vip 31
pxc03 2014-01-15_21_04_56 writer_vip 31 reader_vip 31
Both VIPs are now on pxc01; let’s start MySQL back up on pxc02:
[revin@forge ~]$ for h in pxc01 pxc02 pxc03; do echo "$h $(ssh root@$h 'cat /tmp/keepalived-w.state /tmp/keepalived-r.state'|xargs)"; done
pxc01 writer now master reader now backup
pxc02 writer now backup reader now master
pxc03 writer now backup reader now backup

pxc01 2014-01-15_21_06_41 writer_vip 161 reader_vip 101
pxc02 2014-01-15_21_06_50 writer_vip 101 reader_vip 131
pxc03 2014-01-15_21_06_55 writer_vip 31 reader_vip 31
Our reader VIP is back on pxc02 and the writer VIP remains where it was. When both VIPs end up on a single node (i.e. the last node standing) and a second node comes back up, it is the reader VIP that moves, not the writer; this avoids the risk of breaking connections that may be writing to the node currently holding the writer VIP.