Setting up an Etcd cluster with DNS discovery

Setting up an etcd cluster with DNS discovery may be challenging. There are several building blocks:

  • Etcd – a distributed key value store
  • Amazon EC2 – cloud computing provider
  • Cloudflare – DNS provider
  • Chef – for configuring individual nodes

Each of them has its pitfalls; we will guide you through the whole process.

DNS discovery

Any clustered system needs a way to maintain the list of nodes in the cluster. Usually you have to specify all cluster members when starting a node; this is how ZooKeeper and Consul work. Effectively you get redundancy in configuration: the list of nodes is stored on every node. The list must stay consistent, and it is difficult to maintain, especially if the cluster lives a long life. Old nodes break and get removed, new nodes get added, and the cluster may grow or shrink over time. All that makes maintaining the cluster configuration cumbersome and error prone.

DNS discovery is a killer feature of Etcd. It means you keep the list of cluster nodes in one place. All interested parties – cluster nodes, clients, monitoring agents – get it from DNS. There is a single copy of the list, so there is no chance of an inconsistent cluster view. The Etcd team should advertise this feature in capital letters on their website.
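
To make this concrete, here is a minimal sketch (not part of the original tooling) of how any interested party can read the cluster topology back from DNS. It assumes dig is installed and uses the _etcd-server._tcp SRV name that we create later in this post.

from subprocess import check_output

def etcd_peers(domain):
    """Return (host, port) tuples for every etcd peer advertised in DNS."""
    out = check_output(["dig", "+short", "SRV", "_etcd-server._tcp.%s" % domain]).decode()
    peers = []
    for line in out.splitlines():
        # Each answer looks like: "0 0 2380 etcd-<uuid>.twindb.com."
        priority, weight, port, target = line.split()
        peers.append((target.rstrip("."), int(port)))
    return peers

print(etcd_peers("twindb.com"))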

Process overview

Three nodes will form the cluster. This is the minimum number of nodes that tolerates one failure, whether planned or not.
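
The reason is quorum: a cluster of N members needs floor(N/2) + 1 of them alive to keep accepting writes. The snippet below (an illustration, not part of the original tooling) shows why three is the smallest useful size.

def fault_tolerance(n):
    # quorum is the majority of members; everything beyond it can fail
    quorum = n // 2 + 1
    return n - quorum

for n in (1, 3, 5):
    print("%d-node cluster tolerates %d failure(s)" % (n, fault_tolerance(n)))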

For each node we start an Amazon EC2 instance and create DNS records.

When all instances are ready and DNS is prepared, we start Etcd on the three nodes simultaneously.

Starting Amazon EC2 instance

We will illustrate each step with excerpts from real code. We will skip the unimportant parts, so copy & paste won't work.

INSTANCE_TYPE = "t2.micro"
KEYPAIR_NAME = "deployer"
SUBNET_ID = "subnet-12c7b638"   # private, infra
SECURITY_GROUP = "sg-f525808e"  # default VPC security group
ROOT_VOLUME_SIZE = 16
DATA_VOLUME_SIZE = 0
ZONE_NAME = "twindb.com"
DISCOVERY_SRV_DOMAIN = ZONE_NAME
SSH_OPTIONS = "-o StrictHostKeyChecking=no"
CLUSTER_SIZE = 3

The function setup_new_cluster() starts a cluster of a given size.
It calls launch_ec2_instance() in a loop, which in turn starts an EC2 instance and waits until it is reachable via SSH (a hypothetical sketch of launch_ec2_instance() follows the excerpt below).

def setup_new_cluster(size=CLUSTER_SIZE):

    Log.info("Initiating cluster with %d nodes" % size)

    nodes = []
    # Create file with private SSH key
    deployer_key_file = NamedTemporaryFile(bufsize=0)
    deployer_key_file.file.write(deployer_private_key)

    for i in range(size):
        node_id = uuid.uuid1().hex
        Log.info("Configuring node %s" % node_id)

        node_name = "etcd-%s" % node_id

        Log.info("Starting instance for node %s" % node_id)
        instance_id = launch_ec2_instance(AWS_DEFAULT_AMI, INSTANCE_TYPE, KEYPAIR_NAME, SECURITY_GROUP, SUBNET_ID,
                                          ROOT_VOLUME_SIZE, DATA_VOLUME_SIZE,
                                          get_instance_username_by_ami(AWS_DEFAULT_AMI), deployer_key_file.name,
                                          public=False, name=node_name)
        if not instance_id:
            Log.error("Failed to launch EC2 instance")
            exit(-1)

        Log.info("Launched instance %s" % instance_id)

Creating DNS records

Three DNS records must be created for each node:

  1. An A record to resolve the host name into an IP address
  2. An SRV record that tells which TCP port serves other cluster nodes (peer traffic)
  3. An SRV record that tells which TCP port serves client requests

Eventually we should get the following DNS records.

A cluster node requests these SRV records when it wants to know who the other peers are and which ports they listen on.

$ dig +noall +answer SRV _etcd-server._tcp.twindb.com
_etcd-server._tcp.twindb.com. 299 IN SRV 0 0 2380 etcd-1e1650524ba511e68a9b12cb523caae1.twindb.com.
_etcd-server._tcp.twindb.com. 299 IN SRV 0 0 2380 etcd-6794c99e4ba411e68a9b12cb523caae1.twindb.com.
_etcd-server._tcp.twindb.com. 299 IN SRV 0 0 2380 etcd-c3e9dea04ba411e68a9b12cb523caae1.twindb.com.

If a client wants to communicate with the cluster, it requests these SRV records to learn which host names and ports to connect to.

$ dig +noall +answer SRV _etcd-client._tcp.twindb.com
_etcd-client._tcp.twindb.com. 299 IN    SRV 0 0 2379 etcd-6794c99e4ba411e68a9b12cb523caae1.twindb.com.
_etcd-client._tcp.twindb.com. 299 IN    SRV 0 0 2379 etcd-1e1650524ba511e68a9b12cb523caae1.twindb.com.
_etcd-client._tcp.twindb.com. 299 IN    SRV 0 0 2379 etcd-c3e9dea04ba411e68a9b12cb523caae1.twindb.com.

And finally, the A records to resolve the host names:

$ dig +noall +answer etcd-6794c99e4ba411e68a9b12cb523caae1.twindb.com. etcd-1e1650524ba511e68a9b12cb523caae1.twindb.com. etcd-c3e9dea04ba411e68a9b12cb523caae1.twindb.com.
etcd-6794c99e4ba411e68a9b12cb523caae1.twindb.com. 299 IN A 10.5.1.81
etcd-1e1650524ba511e68a9b12cb523caae1.twindb.com. 299 IN A 10.5.1.66
etcd-c3e9dea04ba411e68a9b12cb523caae1.twindb.com. 299 IN A 10.5.1.203

At TwinDB we use CloudFlare to host the twindb.com zone. CloudFlare provides an API that we're going to use.

def setup_new_cluster(size=CLUSTER_SIZE):
...
        dns_record_name = node_name + "." + DISCOVERY_SRV_DOMAIN
        private_ip = get_instance_private_ip(instance_id)
        if not create_dns_record(dns_record_name,
                                 ZONE_NAME,
                                 private_ip):
            Log.error("Failed to create an A DNS record for %s" % dns_record_name)
            exit(-1)

        if not create_dns_record("_etcd-server._tcp." + DISCOVERY_SRV_DOMAIN,  # "_etcd-server._tcp.twindb.com"
                                 ZONE_NAME,
                                 "0\t2380\t%s" % dns_record_name,
                                 data={
                                     "name": DISCOVERY_SRV_DOMAIN,
                                     "port": 2380,
                                     "priority": 0,
                                     "proto": "_tcp",
                                     "service": "_etcd-server",
                                     "target": dns_record_name,
                                     "weight": 0
                                 },
                                 record_type="SRV"):
            Log.error("Failed to create a SRV record for %s" % dns_record_name)
            Log.error("Trying to delete DNS record for %s" % dns_record_name)
            delete_dns_record(dns_record_name, ZONE_NAME)
            Log.error("Trying to terminate instance %s" % instance_id)
            terminate_ec2_instance(instance_id)
            exit(-1)

        if not create_dns_record("_etcd-client._tcp." + DISCOVERY_SRV_DOMAIN,
                                 ZONE_NAME,
                                 "0\t2379\t%s" % dns_record_name,
                                 data={
                                     "name": DISCOVERY_SRV_DOMAIN,
                                     "port": 2379,
                                     "priority": 0,
                                     "proto": "_tcp",
                                     "service": "_etcd-client",
                                     "target": dns_record_name,
                                     "weight": 0
                                 },
                                 record_type="SRV"):
            Log.error("Failed to create a SRV record for %s" % dns_record_name)
            Log.error("Trying to delete DNS record for %s" % dns_record_name)
            delete_dns_record(dns_record_name, ZONE_NAME)
            Log.error("Trying to terminate instance %s" % instance_id)
            terminate_ec2_instance(instance_id)
            exit(-1)

For reference, here is the code that works with the CloudFlare API:

def cf_api_call(url, method="GET", data=None):

    cmd = ["curl", "--silent", "-X", method,
           "https://api.cloudflare.com/client/v4%s" % url,
           "-H", "X-Auth-Email: %s" % CLOUDFLARE_EMAIL,
           "-H", "X-Auth-Key: %s" % CLOUDFLARE_AUTH_KEY,
           "-H", "Content-Type: application/json"
           ]
    if data:
        cmd.append("--data")
        cmd.append(data)
    try:
        Log.debug("Executing: %r" % cmd)
        cf_process = Popen(cmd, stdout=PIPE, stderr=PIPE)
        cout, cerr = cf_process.communicate()

        if cf_process.returncode != 0:
            Log.error(cerr)
            return None

        try:
            Log.debug(cout)
            return json.loads(cout)

        except ValueError as err:
            Log.error(err)
            Log.error(cerr)
            return None

    except OSError as err:
        Log.error(err)
        return None


def create_dns_record(name, zone, content, data=None, record_type="A", ttl=1):

    zone_id = get_zone_id(zone)

    url = "/zones/%s/dns_records" % zone_id
    request = {
        "name": name,
        "content": content,
        "type": record_type,
        "ttl": ttl
    }

    if data:
        request["data"] = data

    response = cf_api_call(url, method="POST", data=json.dumps(request))

    # cf_api_call() returns None if curl fails or the response is not valid JSON
    if not response:
        return False

    if not response["success"]:
        for error in response["errors"]:
            Log.error("Error(%d): %s" % (error["code"], error["message"]))

    return bool(response["success"])

It takes time before the DNS changes we made propagate and become visible on a node, so we should wait until DNS is ready:

def setup_new_cluster(size=CLUSTER_SIZE):
...
        # wait till dns_record_name resolves into private_ip
        Log.info("Waiting till DNS changes are propagated")
        while True:
            try:
                ip = socket.gethostbyname(dns_record_name)
                if ip == private_ip:
                    break
                else:
                    Log.error("%s resolved into unexpected %s" % (dns_record_name, ip))
            except socket.gaierror:
                Log.info("waiting...")
                pass
            time.sleep(3)

        # Save node in a list. We will need it later
        nodes.append({
            'key': deployer_key_file.name,
            'ip': private_ip,
            'name': node_name
        })

At this point we should have three Amazon EC2 instances up and running and all DNS records in place.

Bootstrapping an Etcd node

We use a Chef recipe for the etcd cluster. There are two gotchas with the recipe:

  1. By default it installs the ancient Etcd version 2.2.5, which is buggy.
  2. The recipe installs an init script that will fail when you start the first node (see Bug#63 for details). By the way, I have had no feedback from the Chef team as of the time of writing, but they didn't forget to send me a bunch of cold sales calls and spam. Hats off to the Etcd team: they're extremely responsive, even on weekends.

Etcd recipe attributes

We need to specify only one attribute: the domain name.

default['etcd-server']['discovery_srv'] = 'twindb.com'

Etcd recipe

etcd_installation 'default' do
  version '3.0.2'
  action :create
end

etcd_service node.default['chef_client']['config']['node_name'] do
  discovery_srv node.default['etcd-server']['discovery_srv']

  initial_advertise_peer_urls 'http://' + node.default['chef_client']['config']['node_name'] + '.twindb.com:2380'
  advertise_client_urls 'http://' + node.default['chef_client']['config']['node_name'] + '.twindb.com:2379'

  initial_cluster_token 'etcd-cluster-1'
  initial_cluster_state 'new'

  listen_client_urls 'http://0.0.0.0:2379'
  listen_peer_urls 'http://0.0.0.0:2380'
  data_dir '/var/lib/etcd'
  action :start
end
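
For readers who don't use Chef, the etcd_service resource above boils down to roughly the following etcd invocation. This is a hand-assembled sketch (not taken from the cookbook's template); the flag names are etcd's own.

from subprocess import Popen

def start_etcd(node_name, domain="twindb.com", data_dir="/var/lib/etcd"):
    """Start etcd with DNS discovery, mirroring the Chef resource above."""
    peer_url = "http://%s.%s:2380" % (node_name, domain)
    client_url = "http://%s.%s:2379" % (node_name, domain)
    cmd = [
        "etcd",
        "--name", node_name,
        "--discovery-srv", domain,
        "--initial-advertise-peer-urls", peer_url,
        "--advertise-client-urls", client_url,
        "--initial-cluster-token", "etcd-cluster-1",
        "--initial-cluster-state", "new",
        "--listen-peer-urls", "http://0.0.0.0:2380",
        "--listen-client-urls", "http://0.0.0.0:2379",
        "--data-dir", data_dir,
    ]
    return Popen(cmd)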

When the recipe is ready (we use our own Chef server), we can bootstrap the three cluster nodes. Remember, we need to start them simultaneously.

def setup_new_cluster(size=CLUSTER_SIZE):
...
    pool = Pool(processes=size)
    pool.map(bootstrap_node, nodes)

Code to bootstrap one node:

def bootstrap(key, ip, node_name):

    try:
        username = get_instance_username_by_ami(AWS_DEFAULT_AMI)

        hosts_file = os.environ['HOME'] + "/.ssh/known_hosts"
        if isfile(hosts_file):
            run_command("ssh-keygen -f " + hosts_file + " -R " + ip)

        cmd = "knife bootstrap " + ip \
              + " --ssh-user " + username \
              + " --sudo --identity-file " + key \
              + " --node-name " + node_name \
              + " --yes " \
                " --run-list 'recipe[etcd-server]'"
        run_command(cmd)

    except CalledProcessError as err:
        Log.error(err.output)
        return False

    return True


def bootstrap_node(node):
    """
    Bootstrap node
    :param node: dictionary with node parameters. Dictionary must contain keys:
        key - path to SSH private key
        ip - IP address of the node
        name - node hostname
    :return: True if success or False otherwise
    """
    try:
        return bootstrap(node['key'], node['ip'], node['name'])
    except KeyError as err:
        Log.error(err)
        return False

Checking the health of the Etcd cluster

Now we can communicate with the Etcd cluster from any host that has etcdctl installed:

$ etcdctl --discovery-srv twindb.com cluster-health
member 83062705e5ba24af is healthy: got healthy result from http://etcd-c3e9dea04ba411e68a9b12cb523caae1.twindb.com:2379
member 9fca41c9f65e3e96 is healthy: got healthy result from http://etcd-1e1650524ba511e68a9b12cb523caae1.twindb.com:2379
member b8dfb16b4af1fd49 is healthy: got healthy result from http://etcd-6794c99e4ba411e68a9b12cb523caae1.twindb.com:2379
cluster is healthy
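
If etcdctl is not available, the same check can be scripted. The sketch below (an illustration in the spirit of the curl-based helpers above, not part of the original tooling) resolves the client SRV records and polls each member's /health endpoint on the client port.

import json
from subprocess import CalledProcessError, check_output

def cluster_health(domain):
    """Print the health of every etcd member advertised in DNS."""
    srv = check_output(["dig", "+short", "SRV", "_etcd-client._tcp.%s" % domain]).decode()
    all_healthy = True
    for line in srv.splitlines():
        priority, weight, port, target = line.split()
        url = "http://%s:%s/health" % (target.rstrip("."), port)
        try:
            # etcd answers {"health": "true"} on the client port when a member is healthy
            status = json.loads(check_output(["curl", "--silent", url]).decode())
            healthy = status.get("health") in ("true", True)
        except (CalledProcessError, ValueError):
            healthy = False
        all_healthy = all_healthy and healthy
        print("%s: %s" % (url, "healthy" if healthy else "unhealthy"))
    return all_healthy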

Happy service discovery!
