Fri, 2015-07-31 12:14guillaumelefranc
Mandatory disclaimer: the techniques described in this blog post are experimental, so use at your own risk. Neither me nor MariaDB Corporation will be held responsible if anything bad happens to your servers.
Context
MaxScale 1.2.0 and above can call external scripts on monitor events. In the case of a classic Master-Slave setup, this can be used for automatic failover and promotion using MariaDB Replication Manager. The following use case is exposed using three MariaDB servers (one master, two slaves) and a MaxScale server. Please refer to my Vagrant files if you want to jumpstart such a testing platform.
Requirements
- A mariadb-repmgr binary, version 0.4.0 or above. Grab it from the github Releases page, and extract in /usr/local/bin/ on your MaxScale server.
- A working MaxScale installation with MySQL Monitor setup and whatever router you like. Please refer to the MaxScale docs for more information on how to configure it correctly.
MaxScale installation and configuration
The MySQL Monitor has to be configured to send scripts. Add the following three lines to your [MySQL Monitor] section:
monitor_interval=1000 script=/usr/local/bin/failover.sh events=master_down
Failover script
As of the current MaxScale development branch, custom options are not supported, so we have to use a wrapper script to call MariaDB Replication Manager. Create the following script in /usr/local/bin/failover.sh:
#!/bin/bash # failover.sh # wrapper script to repmgr # user:password pair, must have administrative privileges. user=root:admin # user:password pair, must have REPLICATION SLAVE privileges. repluser=repluser:replpass ARGS=$(getopt -o '' --long 'event:,initiator:,nodelist:' -- "$@") eval set -- "$ARGS" while true; do case "$1" in --event) shift; event=$1 shift; ;; --initiator) shift; initiator=$1 shift; ;; --nodelist) shift; nodelist=$1 shift; ;; --) shift; break; ;; esac done cmd="mariadb-repmgr -user $user -rpluser $repluser -hosts $nodelist -failover=dead" eval $cmd
Make sure to configure user and repluser script variables to whatever your user:password pairs are for administrative user and replication user. Also make sure to make the script executable (chown +x) as it's very easy to forget that step.
Testing that failover works
Let's check the current status, where I have configured server3 as a master and server1-2 as slaves:
$ maxadmin -pmariadb "show servers" Server 0x1b1f440 (server1) Server: 192.168.56.111 Status: Slave, Running Protocol: MySQLBackend Port: 3306 Server Version: 10.0.19-MariaDB-1~trusty-log Node Id: 1 Master Id: 3 Slave Ids: Repl Depth: 1 Number of connections: 0 Current no. of conns: 0 Current no. of operations: 0 Server 0x1b1f330 (server2) Server: 192.168.56.112 Status: Slave, Running Protocol: MySQLBackend Port: 3306 Server Version: 10.0.19-MariaDB-1~trusty-log Node Id: 2 Master Id: 3 Slave Ids: Repl Depth: 1 Number of connections: 8 Current no. of conns: 1 Current no. of operations: 0 Server 0x1a7b2c0 (server3) Server: 192.168.56.113 Status: Master, Running Protocol: MySQLBackend Port: 3306 Server Version: 10.0.19-MariaDB-1~trusty-log Node Id: 3 Master Id: -1 Slave Ids: 1, 2 Repl Depth: 0 Number of connections: 2 Current no. of conns: 0 Current no. of operations: 0
Everything looks normal. Let's try failover by shutting down server3.
server3# service mysql stop * Stopping MariaDB database server mysqld [ OK ]
Let's check the server status again:
$ maxadmin -pmariadb "show servers" Server 0x1b1f440 (server1) Server: 192.168.56.111 Status: Slave, Running Protocol: MySQLBackend Port: 3306 Server Version: 10.0.19-MariaDB-1~trusty-log Node Id: 1 Master Id: 2 Slave Ids: Repl Depth: 1 Number of connections: 0 Current no. of conns: 0 Current no. of operations: 0 Server 0x1b1f330 (server2) Server: 192.168.56.112 Status: Master, Running Protocol: MySQLBackend Port: 3306 Server Version: 10.0.19-MariaDB-1~trusty-log Node Id: 2 Master Id: -1 Slave Ids: 1 Repl Depth: 0 Number of connections: 8 Current no. of conns: 1 Current no. of operations: 0 Server 0x1a7b2c0 (server3) Server: 192.168.56.113 Status: Down Protocol: MySQLBackend Port: 3306 Server Version: 10.0.19-MariaDB-1~trusty-log Node Id: 3 Master Id: -1 Slave Ids: Repl Depth: 0 Number of connections: 2 Current no. of conns: 0 Current no. of operations: 0
MariaDB Replication Manager has promoted server2 to be the new master, and server1 has been reslaved to server2. server3 is now marked as down. If you restart server3, it will be marked as "Running" but not as slave - to put it back in the cluster, you just need to repoint replication with GTID with this command: CHANGE MASTER TO MASTER_HOST='server1', MASTER_USE_GTID=CURRENT_POS; The failover script could handle this case as well, although it remains to be tested.
Tags: High AvailabilityMaxScaleReplication About the Author
Guillaume Lefranc is managing the MariaDB Remote DBA Services Team, delivering performance tuning and high availability services worldwide. He's a believer in DevOps culture, Agile software development, and Craft Brewing.