Mesos DNS with Azure Container Service

The Azure Container Service (ACS) makes it easy to deploy a Mesos cluster on Microsoft’s cloud. Unfortunately, networking between containers on different slave nodes is a little more complex than on your local machine. mesos-dns mostly solves this problem, but it still needs to be deployed on your cluster. In this brief tutorial I show a novel way of deploying it so you can be up and running in a minute or two.

The Docker Image

To make deploying mesos-dns a simple affair, I created a custom docker image based on mesosphere/mesos-dns that automatically detects up to 14 masters as created by Azure in your ACS deployment. This means that in most cases you can simply run the image and be ready to go.

Deploy Mesos DNS

After you have created your ACS cluster, you’ll deploy the acs-mesos-dns image using marathon. This way, if mesos-dns crashes, it will be restarted automatically. To do this, open the marathon management site for your cluster. Generally it is located on your mesos master on port 8080 (you may need to access it through your jumpbox, if you created one).

Once there, create a new application with the following parameters:

  • Id: mesos-dns
  • Instances: The number of slaves in your ACS cluster. Include any masters that are also slaves.
  • Image: mblouin/acs-mesos-dns
  • Network: bridged
  • Map UDP port 53 on the container to port 53 on the host

For reference, here’s the raw configuration:

{
  "type": "DOCKER",
  "volumes": [],
  "docker": {
    "image": "mblouin/acs-mesos-dns",
    "network": "BRIDGE",
    "portMappings": [
      {
        "containerPort": 53,
        "hostPort": 53,
        "servicePort": 53,
        "protocol": "udp"
      }
    ],
    "privileged": false,
    "parameters": [],
    "forcePullImage": false
  }
}
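
If you would rather script this step than click through the web UI, the same application can probably be submitted through Marathon’s REST API (POST /v2/apps). What follows is only a minimal sketch: mesos_dns_app_json is a hypothetical helper name, the cpus and mem values are placeholders I picked rather than tested recommendations, and the URL in the usage comment assumes marathon is reachable on localhost:8080 (for example from the master itself or through an ssh tunnel).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a full Marathon app definition for mesos-dns.
# $1 = number of instances (one per slave in the cluster).
mesos_dns_app_json() {
  local instances="$1"
  cat <<EOF
{
  "id": "mesos-dns",
  "instances": ${instances},
  "cpus": 0.1,
  "mem": 64,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "mblouin/acs-mesos-dns",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 53, "hostPort": 53, "servicePort": 53, "protocol": "udp" }
      ]
    }
  }
}
EOF
}

# Usage (assumed endpoint; adjust host and port to your cluster):
# mesos_dns_app_json 2 | curl -X POST -H "Content-Type: application/json" \
#     -d @- http://localhost:8080/v2/apps
```

The container section mirrors the raw configuration above; only the id, instances, cpus and mem fields are added so the payload is a complete app definition.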

Click create, and while you wait for the deployment to complete, let’s discuss why it’s a good idea to run mesos-dns on each slave:

  • You haven’t introduced hidden dependencies between your workloads and another server running mesos-dns. If you had only a single machine running it and that machine went down, then all of your applications would fail.
  • All requests are local.
  • No single mesos-dns instance has to handle a large number of requests as you scale your cluster.

Those seem like solid benefits to me! Simply ensure that you apply the slave DNS configuration below to any new slaves you create, and that the marathon instance count is always equal to the number of slaves you have.

Configure Slave DNS

The last step is to configure slave DNS lookups to use mesos-dns. Set NUM_SLAVES in the bash script below to the number of slaves you have, then run it. The script will ssh into each slave and add 127.0.0.1 as a nameserver, so that lookups go to the local mesos-dns service. It also adds marathon.mesos and mesos as search domains.

Note that you’ll likely need to authenticate ssh for each slave. If you haven’t got your ssh keys on the jumpbox, consider my advice on setting up ACS ssh keys.

# Set this to the number of slaves in your cluster.
NUM_SLAVES=2
# Slaves are addressed as 10.0.0.20, 10.0.0.21, ... (index + 19).
for i in $(seq 1 "$NUM_SLAVES"); do
    ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null "10.0.0.$((i+19))" \
    "echo \"nameserver 127.0.0.1\" | sudo tee -a /etc/resolvconf/resolv.conf.d/head && \
     echo \"search marathon.mesos mesos\" | sudo tee -a /etc/resolvconf/resolv.conf.d/base && sudo resolvconf -u"
done
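
One caveat with the loop above: tee -a appends on every run, so running the script a second time against the same slave leaves duplicate nameserver and search lines behind. A guard along the following lines avoids that. This is a sketch only; append_once is a hypothetical helper, and on a real slave it would have to run as root to write under /etc/resolvconf.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so the configuration step can be repeated safely.
# $1 = line to add, $2 = target file.
append_once() {
  local line="$1" file="$2"
  # -x matches whole lines, -F treats the text as a fixed string.
  grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Usage on a slave, as root (paths taken from the script above):
# append_once "nameserver 127.0.0.1" /etc/resolvconf/resolv.conf.d/head
# append_once "search marathon.mesos mesos" /etc/resolvconf/resolv.conf.d/base
# resolvconf -u
```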

Congrats! Now all containers on your cluster will use mesos-dns by default. You can test this by running the following commands on a slave machine:

ping mesos-dns.marathon.mesos
ping mesos-dns

Note that this will only work if you called the job mesos-dns in marathon. Otherwise change the name accordingly.
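
If the pings fail, one thing worth checking is whether 127.0.0.1 actually ended up as the first nameserver on the slave, since entries in the resolvconf head file are meant to land at the top of /etc/resolv.conf. A small sketch of that check; first_nameserver is a hypothetical helper, not part of mesos-dns or resolvconf.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first nameserver listed in a resolv.conf file.
# $1 = path to the file (defaults to /etc/resolv.conf).
first_nameserver() {
  awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Usage on a slave:
# first_nameserver    # should print 127.0.0.1 if the configuration was applied
```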
