I recently gave a presentation at Percona Live 2015 in Santa Clara, CA. In this presentation I wanted to show MySQL replication running on Kubernetes, first asynchronous replication and, more importantly, a Galera cluster, and in doing so demonstrate how useful Kubernetes is.
Why?
The talk was a good chance to introduce the MySQL community (developers, DBAs, sysadmins, and others) to what Kubernetes is and what it means for MySQL.
A bit of learning
I thought at the time I submitted my synopsis that the talk would be straightforward. About two or three months ago, I started working on the setup I would use for the demonstration. My goal was to use a stock CoreOS cluster with the necessary Kubernetes components installed and running as a cluster.
The reality was that there was a bit more to it than that. Isn't that how it always goes with complex systems? To make a long story short, I tried the Vagrant setup for CoreOS using the cloud-init scripts from the Kubernetes documentation, but I could never get a Kubernetes cluster running reliably that way. Hence the blog post I recently published covering my basic setup.
Finally, using the process outlined in that [post], I had a Kubernetes cluster that worked consistently, for the most part. One gotcha was that upon launching the cluster, the cloud-init scripts had to download the various binaries required to run Kubernetes and set up networking; on a slow network connection this timing caused the launch to fail, something I plan to fix and contribute back to the community.
Asynchronous Replication
With a working Kubernetes cluster, I decided it was time to start with regular MySQL asynchronous replication, since it would make for a simpler proof of concept. The way to do this was essentially to modify the standard MySQL Docker container into a master and a slave variant. At a higher level, there are two pods: a master pod and a slave pod. The master pod runs only one container, while the slave pod could run one or more containers.
The master container is built using a Dockerfile that specifies an entrypoint shell script. This is the same basic pattern the stock MySQL container uses, albeit only to set up essential MySQL settings, particularly the root user and password. The modifications to this entrypoint script set up the replication user privileges (name, password, and host to allow). To do this, when the container is started, environment variables are passed from the pod configuration file supplying the MySQL root password, replication username, and replication password. The host to allow connections from is 10.x.x.x, as that is the IP range Kubernetes assigns to pods; this range covers any container in the slave pod(s) that needs to connect as a slave. With these environment variables, the entrypoint script builds up an SQL script containing these privilege modifications, which is run with MySQL in insecure mode (for initialization) using mysqld --initialize-insecure=on. Additionally, a script called "random.sh" runs to set the server-id value in my.cnf. Once the master pod is running, a master service called mysql_master is started, which, thanks to some great Kubernetes functionality, is made available through environment variables (MYSQL_MASTER_SERVICE_HOST and MYSQL_MASTER_SERVICE_PORT) on any container launched thereafter, including the slave container.
From the entrypoint script:
echo "GRANT REPLICATION SLAVE, REPLICATION CLIENT on *.* TO '$MYSQL_REPLICATION_USER'@'10.100.%' IDENTIFIED BY '$MYSQL_REPLICATION_PASSWORD';" >> "$tempSqlFile"
The slave container is built similarly to the master container, with the Dockerfile specifying an entrypoint script, except that instead of setting up privileges, it sets up replication by running CHANGE MASTER in the SQL script that is built up from the aforementioned environment variables, both those passed in and those made available by Kubernetes, in particular MYSQL_MASTER_SERVICE_HOST, the master the slave is set up to read from.
From the entrypoint script:
# Only configure replication when Kubernetes has exposed the master service
if [ ! -z "$MYSQL_MASTER_SERVICE_HOST" ]; then
    echo "STOP SLAVE;" >> "$tempSqlFile"
    echo "CHANGE MASTER TO master_host='$MYSQL_MASTER_SERVICE_HOST', master_user='$MYSQL_REPLICATION_USER', master_password='$MYSQL_REPLICATION_PASSWORD';" >> "$tempSqlFile"
    echo "START SLAVE;" >> "$tempSqlFile"
fi
This is actually quite straightforward and worked the first time I prototyped it. I first ran it as two separate containers, passing the environment variables explicitly, as in the example below:
docker run -e MYSQL_ROOT_PASSWORD=c-kr1t capttofu/mysql_master_kubernetes
docker run -e MYSQL_MASTER_SERVICE_HOST=x.x.x.x -e MYSQL_ROOT_PASSWORD=c-kr1t capttofu/mysql_slave_kubernetes
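To confirm the slave actually connected, one option is to check the slave status inside the slave container; the container ID below is a placeholder, and c-kr1t is the example root password from above:

docker exec -it <slave-container-id> mysql -uroot -pc-kr1t -e 'SHOW SLAVE STATUS\G'

Slave_IO_Running and Slave_SQL_Running should both report Yes.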
Once I verified this, it was a matter of creating master and slave pod files (follow the links to view them). This proved the basic concept worked: using an entrypoint script to set up the database in advance.
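On the Kubernetes side, launching the pair then comes down to two kubectl create calls against those pod files. The file names below are illustrative placeholders; the actual names are in the linked repository:

kubectl create -f mysql-master-pod.yaml
kubectl create -f mysql-slave-pod.yaml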
Galera replication
For Galera replication, it seemed it might actually be simpler, since when setting up Galera replication one need not worry about the binary log position or how to get a snapshot of the data; that is handled by Galera (SST, state snapshot transfer, when a node joins). The difficulty was that services could only have a single port and IP in the version of Kubernetes I had to use for my demo, while Galera replication requires four ports: 3306, 4444, 4567, and 4568. Newer versions of Kubernetes support multiple ports. The way I planned to get around this was to take advantage of the read-only Kubernetes API, whose host is found in the environment variable $KUBERNETES_RO_SERVICE_HOST on every container Kubernetes starts (in a pod). The Kubernetes client kubectl is included in the Docker image. The entrypoint script in turn runs kubectl and parses the output for each of the pods pxc_node1 through pxc_node3, iterating over them in a loop and building up the string used for wsrep_cluster_address. Of course, if the container is launched with the environment variable WSREP_CLUSTER_ADDRESS set to gcomm://, then that value is used instead; this is the case for the pod pxc_node1, the "bootstrap" pod.
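A rough sketch of that part of the entrypoint logic is below. It assumes the pod names pxc-node1 through pxc-node3 and relies on kubectl's pod listing printing the pod name in the first column and its IP in the second, as shown later in this post; the real script may parse things differently.

# Sketch: build wsrep_cluster_address from the pods Kubernetes reports.
# The bootstrap pod passes WSREP_CLUSTER_ADDRESS=gcomm:// and skips this.
if [ "$WSREP_CLUSTER_ADDRESS" != "gcomm://" ]; then
    WSREP_CLUSTER_ADDRESS="gcomm://"
    for node in pxc-node1 pxc-node2 pxc-node3; do
        # Ask the read-only API for the pod's IP (name in column 1, IP in column 2).
        ip=$(kubectl --server="http://${KUBERNETES_RO_SERVICE_HOST}:80" get pods | awk -v n="$node" '$1 == n {print $2}')
        [ -n "$ip" ] && WSREP_CLUSTER_ADDRESS="${WSREP_CLUSTER_ADDRESS}${ip},"
    done
    WSREP_CLUSTER_ADDRESS="${WSREP_CLUSTER_ADDRESS%,}"
fi
# The resulting value is then written into the wsrep settings in my.cnf.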
Galera replication is pretty simple once you know which hosts will be part of the cluster. In this case, the pattern is to launch the pxc_node1 pod as the bootstrap pod, then pxc_node2 and pxc_node3. When this is completed, there should be a cluster.
Actual steps
First, set up a Kubernetes cluster per my blog post.
Pre-reqs
Build the Kubernetes client program:
$ git clone https://github.com/GoogleCloudPlatform/kubernetes
$ cd kubernetes
kubernetes $ make
kubernetes $ sudo cp cmd/kubectl /usr/local/bin
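Before going further, it is worth confirming that kubectl can reach the cluster. The server address below is a placeholder for wherever your API server listens (8080 was the default insecure port at the time); if kubectl is already configured for your cluster, the flag can be dropped:

$ kubectl --server=http://<kubernetes-master>:8080 get pods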
Clone the Kubernetes MySQL replication repository:
$ git clone https://github.com/CaptTofu/mysql_replication_kubernetes.git
$ cd mysql_replication_kubernetes
mysql_replication_kubernetes $ git submodule init
mysql_replication_kubernetes $ git submodule update
Create the pxc-node1 pod
mysql_replication_kubernetes $ cd galera_sync_replication
galera_sync_replication $ kubectl create -f pxc-node1.yaml
pxc-node1
Verify the pod is running
galera_sync_replication $ kubectl get pods
POD IP CONTAINER(S) IMAGE(S) HOST LABELS STATUS CREATED
pxc-node1 10.244.78.2 pxc-node1 capttofu/percona_xtradb_cluster_5_6:latest 172.16.230.131/172.16.230.131 name=pxc-node1 Pending 5 Seconds
In the example above, the status is Pending. Once the status is Running, create the remaining pods.
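Rather than re-running kubectl get pods by hand, a small shell loop (just a convenience, not part of the repository) can do the waiting:

galera_sync_replication $ until kubectl get pods | grep pxc-node1 | grep -q Running; do sleep 5; done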
Create the pxc-node2 and pxc-node3 pods
Once pxc-node1 has a status of Running, create pxc-node2 and pxc-node3:
galera_sync_replication $ kubectl create -f pxc-node2.yaml
pxc-node2
galera_sync_replication $ kubectl create -f pxc-node3.yaml
pxc-node3
Create a service for pxc-node1
From before, recall that pxc-node1 is running on the Kubernetes minion/node with an IP address of 172.16.230.131. Edit the configuration file for the pxc-node1 service so that it is possible to connect to the pxc-node1 pod at that address using publicIPs. Edit pxc-node1-service.yaml:
---
id: pxc-node1
kind: Service
apiVersion: v1beta1
port: 3306
containerPort: 3306
selector:
  name: pxc-node1
labels:
  name: pxc-node1
publicIPs:
  - 172.16.230.131
Once this file is ready, create the service:
galera_sync_replication $ kubectl create -f pxc-node1-service.yaml
pxc-node1
Verify everything is running
All three pods should be running (status Running), along with a single pxc-node1 service:
galera_sync_replication $ kubectl get pods,services
POD IP CONTAINER(S) IMAGE(S) HOST LABELS STATUS CREATED
pxc-node1 10.244.78.2 pxc-node1 capttofu/percona_xtradb_cluster_5_6:latest 172.16.230.131/172.16.230.131 name=pxc-node1 Running About an hour
pxc-node2 10.244.75.2 pxc-node2 capttofu/percona_xtradb_cluster_5_6:latest 172.16.230.139/172.16.230.139 name=pxc-node2 Running About an hour
pxc-node3 10.244.11.2 pxc-node3 capttofu/percona_xtradb_cluster_5_6:latest 172.16.230.144/172.16.230.144 name=pxc-node3 Running 54 minutes
NAME LABELS SELECTOR IP PORT
kubernetes component=apiserver,provider=kubernetes <none> 10.100.0.2 443
kubernetes-ro component=apiserver,provider=kubernetes <none> 10.100.0.1 80
pxc-node1 name=pxc-node1 name=pxc-node1 10.100.43.123 3306
The output above shows that everything is up and running, so it is time to connect to the database!
Access the pxc-node1 service
Services are created immediately, so the database can be accessed right away:
$ mysql -u root -p -h 172.16.230.131
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.6.22-72.0-56 Percona XtraDB Cluster (GPL), Release rel72.0, Revision 978, WSREP version 25.8, wsrep_25.8.r4150
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)]> show status like 'wsrep_inc%'
-> ;
+--------------------------+----------------------------------------------------+
| Variable_name | Value |
+--------------------------+----------------------------------------------------+
| wsrep_incoming_addresses | 10.244.78.2:3306,10.244.11.2:3306,10.244.75.2:3306 |
+--------------------------+----------------------------------------------------+
1 row in set (0.01 sec)
This output shows that all three Galera nodes are up and running!
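Another quick sanity check is wsrep_cluster_size, which on a healthy three-node cluster should report a value of 3:

MySQL [(none)]> show status like 'wsrep_cluster_size';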
Summary
With this proof of concept done, there is much more to do. Most of all, it would be good to use replication controllers instead of simple pods to create the three Galera single-container pods; that way, there is a means of ensuring that all of the pods continue to run. It would also be good to demonstrate this proof of concept's value by launching an application that uses the Galera cluster. At the very least, there is something very useful to start with!
Special thanks to Kelsey Hightower, Tim Hockin, Daniel Smith, and others in #google-containers for their patience and excellent help!