How to make MaxScale High Available with Corosync/Pacemaker

Fri, 2014-08-22 09:55 by massimiliano_pinto_g

MaxScale, an open-source database-centric router for MySQL and MariaDB, makes High Availability possible by hiding the complexity of the backends and masking failures. MaxScale itself, however, is a single application running on a Linux box between the client application and the databases - so how do we make MaxScale itself Highly Available? This blog post shows how to quickly set up a Pacemaker/Corosync environment and configure MaxScale as a managed cluster resource.

By following the instructions detailed here, modifying the configuration files and running the system and software checks, anyone can create a complete setup with three CentOS 6.5 Linux servers and unicast heartbeat mode.

In a few steps MaxScale will be ready for basic HA operation, and one simple failure test, manually killing the running process, is shown as an example.

We make the following assumptions here:

  • The solution is a quick setup example that may not be suited for all production environments.
  • Basic familiarity with Pacemaker/Corosync and the crmsh command line tools is assumed.
  • A Virtual IP is set up providing access to the MaxScale process.
  • MaxScale is already configured and working with a MariaDB/MySQL replication setup or a MariaDB Galera Cluster.
  • The MaxScale process is started/stopped and monitored via the /etc/init.d/maxscale LSB-compatible script, which is available in the RPM package from version 1.0. The script can also be found in the GitHub repository, for Ubuntu as well. A quick sanity check for these last two assumptions is sketched right after this list.
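A minimal check, assuming the default RPM installation path used later in this post, can be run on each node before going any further:

[root@node1 ~]# ls -l /etc/init.d/maxscale
[root@node1 ~]# ls -d /usr/local/skysql/maxscale
[root@node1 ~]# /etc/init.d/maxscale status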

Step 1 - Clustering Software installation

On each cluster node do the following operations:

Let’s start by enabling a new repo

# vi /etc/yum.repos.d/ha-clustering.repo

and add the following lines to the file

[haclustering]
name=HA Clustering
baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/
enabled=1
gpgcheck=0

Now install the software.

# yum install pacemaker corosync crmsh

Please note the package versions used in this setup are the following (a quick way to check your own installed versions is shown after the list):

  • pacemaker-1.1.10-14.el6_5.3.x86_64
  • corosync-1.4.5-2.4.x86_64
  • crmsh-2.0+git46-1.1.x86_64
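To confirm which versions actually ended up on your nodes, rpm's standard query is enough (nothing setup-specific here):

# rpm -q pacemaker corosync crmsh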

Step 2 - Configuring the system

Let’s begin by assigning the hostname to each node.

The node names are: node1, node2, node3

# hostname node1
...
# hostname nodeN

and write the entries in /etc/hosts.

On each node add the server names plus current-node, an alias for the current server:

# vi /etc/hosts
10.74.14.39     node1
10.228.103.72   node2
10.35.15.26     node3 current-node

...

# vi /etc/hosts
10.74.14.39     node1 current-node
10.228.103.72   node2
10.35.15.26     node3
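To make sure the alias resolves correctly, a quick check on each node (purely illustrative) is:

# getent hosts current-node
# ping -c 1 current-node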

Prepare authkey for optional cryptographic use:

On one of the nodes, say node2, run the corosync-keygen utility and follow the instructions.

[root@node2 ~]# corosync-keygen

Corosync Cluster Engine Authentication key generator.
       Gathering 1024 bits for key from /dev/random.
       Press keys on your keyboard to generate entropy.

After completion the key will be found in /etc/corosync/authkey.
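The key should end up owned by root and readable by root only; a quick look confirms it:

[root@node2 ~]# ls -l /etc/corosync/authkey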

Now let’s create the corosync configuration file:

[root@node2 ~]# vi /etc/corosync/corosync.conf

Add the following content to the file:

# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
        version: 2
        secauth: off
        interface {
                member {
                        memberaddr: node1
                }
                member {
                        memberaddr: node2
                }
                member {
                        memberaddr: node3
                }
                ringnumber: 0
                bindnetaddr: current-node
                mcastport: 5405
                ttl: 1
        }
        transport: udpu
}

logging {
        fileline: off
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off

        }
}

# this will start Pacemaker processes
service {
        ver: 0
        name: pacemaker
}

A few notes here:

  • Unicast UDP is used
  • bindnetaddr for the Corosync process is “current-node”, which resolves to the right address on each node thanks to the alias added in /etc/hosts above
  • Pacemaker processes are started by the Corosync daemon, so there is no need to launch them via /etc/init.d/pacemaker start

We can now copy the configuration files and the auth key to each of the other nodes:

[root@node2 ~]# scp /etc/corosync/*  root@node1:/etc/corosync/
...
[root@node2 ~]# scp /etc/corosync/*  root@nodeN:/etc/corosync/
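For a three-node setup like this one, the same copy can be done with a small loop from node2 (just a convenience sketch):

[root@node2 ~]# for n in node1 node3; do scp /etc/corosync/* root@$n:/etc/corosync/; done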

Step 3 - Start the Cluster

The cluster could be started now, but let’s do some additional checks before proceeding.
Corosync needs UDP port 5405 to be open, so any firewall or iptables rules need to be configured accordingly.
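If you prefer to keep the firewall running, a rule along these lines should do (a sketch only; adapt it to your existing iptables policy):

# iptables -I INPUT -p udp --dport 5405 -j ACCEPT
# service iptables save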

Alternatively, for a quick start, just disable iptables on each node:

[root@node2 ~]# service iptables stop
…
[root@nodeN ~]# service iptables stop

Let’s start Corosync on each node:

[root@node2 ~]# /etc/init.d/corosync start
…
[root@nodeN ~]# /etc/init.d/corosync start

and check if the corosync daemon is successfully bound to port 5405:

[root@node2 ~]# netstat -na | grep 5405

udp        0      0 10.228.103.72:5405        0.0.0.0:*

Check also whether the other nodes are reachable with the nc utility and the UDP option (-u):

[root@node2 ~]# echo "check ..." | nc -u node1 5405
[root@node2 ~]# echo "check ..." | nc -u node3 5405
…
[root@node1 ~]# echo "check ..." | nc -u node2 5405
[root@node1 ~]# echo "check ..." | nc -u node3 5405

If the following message is displayed:

nc: Write error: Connection refused

there is a communication problem between the nodes; this is most likely caused by the firewall configuration on your nodes.

Please check and resolve any firewall issues before continuing.

We can check the cluster status, from any node, with this command:

[root@node3 ~]# crm status

After a while the output will look like:

[root@node3 ~]# crm status
Last updated: Mon Jun 30 12:47:53 2014
Last change: Mon Jun 30 12:47:39 2014 via crmd on node2
Stack: classic openais (with plugin)
Current DC: node2 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
3 Nodes configured, 3 expected votes
0 Resources configured


Online: [ node1 node2 node3 ]

The cluster has been started successfully; that’s the first achievement so far!

Please note, in the basic setup we will disable the following properties:

  • stonith
  • quorum policy
[root@node3 ~]# crm configure property 'stonith-enabled'='false'
[root@node3 ~]# crm configure property 'no-quorum-policy'='ignore'

After these commands the configuration is automatically updated on every node; we can check it from another node, say node1:

[root@node1 ~]# crm configure show

node node1
node node2
node node3
property cib-bootstrap-options: \
        dc-version=1.1.10-14.el6_5.3-368c726 \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=3 \
        stonith-enabled=false \
        no-quorum-policy=ignore \
        placement-strategy=balanced \
        default-resource-stickiness=infinity

Well done, the Corosync/Pacemaker cluster is now ready to manage resources; in the next steps we’ll add MaxScale.

Step 4 - Check MaxScale init script

The new MaxScale /etc/init.d/maxscale script allows you to start/stop/restart and monitor the MaxScale process running on the system.

The script found in the RPM package already works with the default installation path:
/usr/local/skysql/maxscale

It might be necessary to modify some variables, such as MAXSCALE_HOME, MAXSCALE_PIDFILE or LD_LIBRARY_PATH, to match the installation directory you chose when you installed MaxScale.
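As an example, the relevant variables at the top of the script might end up looking like this for the default path above (illustrative values only; the LD_LIBRARY_PATH location in particular is an assumption about your layout):

MAXSCALE_HOME=/usr/local/skysql/maxscale
MAXSCALE_PIDFILE=$MAXSCALE_HOME/log/maxscale.pid
export LD_LIBRARY_PATH=$MAXSCALE_HOME/lib:$LD_LIBRARY_PATH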

We assume here MaxScale is configured with a MariaDB/MySQL replication setup or a MariaDB Galera Cluster; those servers might be located on the three Linux boxes we are using or anywhere else.

The following commands should be issued on each node to make sure the application can be run and managed:

[root@node1 ~]# /etc/init.d/maxscale 
Usage: /etc/init.d/maxscale {start|stop|status|restart|condrestart|reload}

Start

[root@node1 ~]# /etc/init.d/maxscale start
Starting MaxScale: maxscale (pid 25892) is running...      [  OK  ]

Start again

[root@node1 ~]# /etc/init.d/maxscale start
Starting MaxScale:  found maxscale (pid  25892) is running.[  OK  ]

Stop

[root@node1 ~]# /etc/init.d/maxscale stop
Stopping MaxScale:                                         [  OK  ]

Stop again

[root@node1 ~]# /etc/init.d/maxscale stop
Stopping MaxScale:                                         [FAILED]

Status (MaxScale not running)

[root@node1 ~]# /etc/init.d/maxscale status
MaxScale is stopped                                        [FAILED]

Status (MaxScale is running)

[root@node1 ~]# /etc/init.d/maxscale status
Checking MaxScale status: MaxScale (pid  25953) is running.[  OK  ]

As the MaxScale script is LSB compatible and returns the proper exit code for each action, it’s now possible to configure the application as a resource in Pacemaker; the next step will show how to do it.
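If you want to verify the exit codes yourself, print $? right after an action; per the LSB convention, status returns 0 when the process is running and 3 when it is stopped:

[root@node1 ~]# /etc/init.d/maxscale status; echo $?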

Step 5 - Configure MaxScale as a cluster resource

We are assuming here that MaxScale can run on each node with the same configuration file.

[root@node2 ~]# crm configure primitive MaxScale lsb:maxscale \
op monitor interval="10s" timeout="15s" \
op start interval="0" timeout="15s" \
op stop interval="0" timeout="30s"

The command above has configured MaxScale as an LSB resource; note "lsb:maxscale".

In Pacemaker there are two different ways for managing applications:

  • Resource agents (VIP, MySQL, Filesystem, etc.)
  • LSB scripts, for applications that don’t require the complexity of a resource agent and for custom applications in general

MaxScale itself manages the backend servers we configured in the etc/MaxScale.cnf service sections, such as:

[RW Split Router]
type=service
router=readwritesplit
servers=server1,server2,server3,server4,server5,server6,server7
user=maxuser
passwd=maxpwd

So we only want Pacemaker to manage the MaxScale process, and the LSB approach is well suited here.

If everything is fine we should see the resource running:

[root@node2 ~]# crm status
Last updated: Mon Jun 30 13:15:34 2014
Last change: Mon Jun 30 13:15:28 2014 via cibadmin on node2
Stack: classic openais (with plugin)
Current DC: node2 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
3 Nodes configured, 3 expected votes
1 Resources configured

Online: [ node1 node2 node3 ]

MaxScale        (lsb:maxscale): Started node1

Well done, another achievement here!

We now have MaxScale running via Pacemaker and we no longer need to start it via /etc/init.d at boot time!
Pacemaker will do all the work, but it needs to be started at boot: with the CentOS 6.5 setup we need at least:

# chkconfig maxscale off
# chkconfig corosync on
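A quick sanity check of the boot settings:

# chkconfig --list corosync
# chkconfig --list maxscale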

Step 6 - Does the HA software work? Let’s see a resource restarted after a failure:

The MaxScale application is now managed by the HA clustering software, but what does that mean?

Will the application be restarted in case of failure? It should be!

Let’s now kill the MaxScale process and see what happens ...

As we know, the MaxScale PID can easily be found in $MAXSCALE_HOME/log/maxscale.pid.
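For instance, assuming MAXSCALE_HOME is set in your shell, the PID can be read directly:

[root@node2 ~]# cat $MAXSCALE_HOME/log/maxscale.pid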

In this example the PID is 26114, and we kill the process with brute force:

[root@node2 ~]# kill -9 26114

[root@node2 ~]# crm status
Last updated: Mon Jun 30 13:16:11 2014
Last change: Mon Jun 30 13:15:28 2014 via cibadmin on node2
Stack: classic openais (with plugin)
Current DC: node2 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
3 Nodes configured, 3 expected votes
1 Resources configured

Online: [ node1 node2 node3 ]

Failed actions:
    MaxScale_monitor_15000 on node1 'not running' (7): call=19, status=complete, last-rc-change='Mon Jun 30 13:16:14 2014', queued=0ms, exec=0ms

Note the MaxScale_monitor failed action above and ... after a few seconds MaxScale will be started again:

[root@node2 ~]# crm status
Last updated: Mon Jun 30 13:16:22 2014
Last change: Mon Jun 30 13:15:28 2014 via cibadmin on node1
Stack: classic openais (with plugin)
Current DC: node2 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
3 Nodes configured, 3 expected votes
1 Resources configured


Online: [ node1 node2 node3 ]

 MaxScale       (lsb:maxscale): Started node1 

The clustering HA software will keep MaxScale running on one of the three Linux boxes we have, but … on which node? And how can we connect to MaxScale from our client application if we don’t know where it runs?

# mysql -h $MAXSCALE_IP -P 4006 -utest -p test

What is the $MAXSCALE_IP then? Let’s follow the last step ...

Step 7 - Add a Virtual IP (VIP) to the cluster

The solution for $MAXSCALE_IP is that the MaxScale process should be contacted through one well-known IP address, which may move across nodes together with MaxScale.

The setup is very easy: assuming an additional IP address is available and can be assigned to any of the nodes, this is the new configuration to add:

[root@node2 ~]# crm configure primitive maxscale_vip ocf:heartbeat:IPaddr2 params ip=192.168.122.125 op monitor interval=10s

There is of course another action to take: the MaxScale process and the VIP must run on the same node, so it’s mandatory to add the group ‘maxscale_service’ to the configuration.

[root@node2 ~]# crm configure group maxscale_service maxscale_vip MaxScale

Here is the final configuration:

[root@node3 ~]# crm configure show
node node1
node node2
node node3
primitive MaxScale lsb:maxscale \
        op monitor interval=15s timeout=10s \
        op start interval=0 timeout=15s \
        op stop interval=0 timeout=30s
primitive maxscale_vip IPaddr2 \
        params ip=192.168.122.125 \
        op monitor interval=10s
group maxscale_service maxscale_vip MaxScale \
        meta target-role=Started
property cib-bootstrap-options: \
        dc-version=1.1.10-14.el6_5.3-368c726 \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=3 \
        stonith-enabled=false \
        no-quorum-policy=ignore \
        placement-strategy=balanced \
        last-lrm-refresh=1404125486

Check the resource status:

[root@node1 ~]# crm status
Last updated: Mon Jun 30 13:51:29 2014
Last change: Mon Jun 30 13:51:27 2014 via crmd on node1
Stack: classic openais (with plugin)
Current DC: node2 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
3 Nodes configured, 3 expected votes
2 Resources configured

Online: [ node1 node2 node3 ]

 Resource Group: maxscale_service
     maxscale_vip       (ocf::heartbeat:IPaddr2):       Started node2 
     MaxScale   (lsb:maxscale): Started node2 

With both resources on node2, the MaxScale service is now reachable via the configured VIP address 192.168.122.125:

# mysql -h 192.168.122.125 -P 4006 -utest -p test

Please note our three-node setup now requires four IP addresses: one for each node plus the floating IP address assigned to MaxScale.
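If you are curious about which node currently holds the VIP, a quick look at the active node shows it (just a verification):

[root@node2 ~]# ip addr show | grep 192.168.122.125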

Summary

The goal of this post was to present a quick HA solution for a running MaxScale setup, using a widely adopted open-source clustering solution.

Even though the main content could be seen as a basic Corosync/Pacemaker setup guide, I encourage you to explore other failure scenarios and the cluster administrative commands, such as moving resources and adding constraints, which can be found through the links below.

The reader might find the LSB script tutorials interesting too, as a way to bring yet another application to the HA side.

Resources

  • MaxScale
  • Clustering Software
  • LSB scripts

Tags: Clustering, High Availability, Howto, MaxScale

About the Author: Massimiliano Pinto

Massimiliano is a Senior Software Engineer working mainly on MaxScale. Massimiliano has worked for almost 15 years in Web companies, playing the roles of Technical Leader and Software Engineer. Prior to joining SkySQL he worked at Banzai Group and Matrix S.p.A., big players in the Italian Web industry. He is still a guy who very much likes the terminal window on his Mac. Apache module and PHP extension skills are included as well.