Deployment

Docker can also be run in a load-balanced, clustered setup using Docker Swarm Mode. To start, each node in the cluster must have Docker installed, and each node must be able to reach the others over the network.

If you want to experiment with this walkthrough, a useful tool is Canonical's multipass. It allows you to quickly create Ubuntu VMs that can be assembled into a Docker cluster. It can be installed on any OS. Once installed, you can create as many Ubuntu VMs pre-configured with Docker as you like.
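
On Linux, for example, multipass is typically installed as a snap (other operating systems have their own installers; see the multipass site for details):

$ sudo snap install multipass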

For example, to create 3 VMs with Docker installed named manager, worker1 and worker2:

$ multipass launch --name manager docker
Launched: manager
$ multipass launch --name worker1 docker
Launched: worker1
$ multipass launch --name worker2 docker
Launched: worker2

To start a shell in any of these VMs, specify the name of the VM and use the shell subcommand:

$ multipass shell manager

When you are done using these VMs, you can clean them up like this:

$ multipass delete manager
$ multipass delete worker1
$ multipass delete worker2
$ multipass purge

Create a Swarm Cluster

A swarm has 1 or more manager nodes, which are in charge of the cluster, as well as various worker nodes, which run the swarm's containers. We begin by creating our manager node(s), then joining our workers to the manager(s).

Begin by obtaining the IP address of the manager node using ifconfig, ip, or another Linux command. For example, suppose our manager node is running at 192.168.64.11. If the nodes have multiple IP addresses (e.g., internal vs. external networks), choose the internal address, since this is what the nodes in the swarm will use to communicate with one another.
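
For example, on most Linux systems either of the following will show the node's addresses:

# List the node's IPv4 addresses
$ ip -4 addr show
# Or print all assigned addresses on one line
$ hostname -I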

Once you know the manager's IP address, use it to create the swarm and define the manager node:

# Needs to be run as root/sudo
$ sudo docker swarm init --advertise-addr 192.168.64.11
Swarm initialized: current node (z2kzrlomvm4f05ru94zksw5iu) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-0pnx5m0x6seoezo5w1ihru2kjuffvmloqmq9uc0tqsx6uigjnt-daiis27rzreqzspzko70kijah 192.168.64.11:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

Here we see that our node z2kzrlomvm4f05ru94zksw5iu has been added to a swarm as the manager. We also see the command necessary to join a worker node to our swarm, including the security token to use. The manager and worker nodes communicate over TCP port 2377, which must be open internally in the firewall for this to work.
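
If the nodes run a host firewall such as ufw, the swarm ports need to be opened explicitly. A minimal sketch, assuming ufw is in use (in addition to 2377, Docker Swarm uses ports 7946 and 4789 for node communication and overlay networking):

# Cluster management traffic
$ sudo ufw allow 2377/tcp
# Node-to-node communication (TCP and UDP)
$ sudo ufw allow 7946
# Overlay network traffic
$ sudo ufw allow 4789/udp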

Add a Worker Node to a Swarm

Once we have our swarm created and manager node joined, we can begin to add our worker nodes. Each worker node needs to run the docker swarm join ... command displayed when we added the manager node:

# Needs to be run as root/sudo
$ sudo docker swarm join --token SWMTKN-1-5p759blc640c6akn3x2i36govqpt9oefqhefe877dhgcncglqe-5oj7je1dowj4rsuou3vlvil4r 192.168.64.11:2377
This node joined a swarm as a worker.

This process can be done immediately, or later on when more worker nodes are needed. If you need to do this in the future, run the following command to get the join token again (NOTE: you can request the token for adding more managers or workers):

# For adding another worker
$ sudo docker swarm join-token worker
To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-5p759blc640c6akn3x2i36govqpt9oefqhefe877dhgcncglqe-5oj7je1dowj4rsuou3vlvil4r 192.168.64.11:2377

# For adding another manager
$ sudo docker swarm join-token manager
To add a manager to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-5p759blc640c6akn3x2i36govqpt9oefqhefe877dhgcncglqe-2yyy06e2u9476cisor0yoy3ul 192.168.64.11:2377

The swarm can grow (or shrink) dynamically as needs change, and workloads will automatically be moved and load-balanced.
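
For example, to shrink the swarm gracefully, we can drain a node so its tasks are rescheduled elsewhere, have it leave the swarm, and then remove it from the node list (using the worker2 node from our example):

# On a manager: stop scheduling work on worker2 and move its existing tasks
$ sudo docker node update --availability drain worker2
# On worker2 itself: leave the swarm
$ sudo docker swarm leave
# Back on the manager: remove the departed node from the node list
$ sudo docker node rm worker2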

Listing Nodes in a Swarm

We can see the list of all nodes participating in the swarm by using docker node ls:

$ sudo docker node ls
ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
gm6z5pn7nunainw8foznrmxuy *   manager    Ready     Active         Leader           20.10.17
nrcr6mx1kks1ydvn1x6k76ih6     worker1    Ready     Active                          20.10.17

Here we see that we have one manager node (i.e., the Leader), and one non-manager (i.e., worker). Both nodes are Active.
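
Node roles can also be changed later: a worker can be promoted to a manager (or a manager demoted back to a worker) from an existing manager node:

$ sudo docker node promote worker1
$ sudo docker node demote worker1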

Running a Container on a Swarm

When we want to run a container on the swarm, we first need to create a service using docker service create. We give our service a name using --name, specify the ports to publish using --publish host:container, and finally specify the image and version to use. We'll create an nginx server, specifically version 1.22 (we'll update it below):

# Needs to be run as root/sudo
$ sudo docker service create --name web --publish 8080:80 nginx:1.22
y7uv8fqc800xi184rctjxtyjt
overall progress: 1 out of 1 tasks 
1/1: running   
verify: Service converged

Here we created a new service in the swarm named web, which maps port 80 within the container to port 8080 on the host (i.e., the swarm). Our web service uses the nginx web server image with the 1.22 tag. Running the command above causes Docker to pull the nginx:1.22 image (if it doesn't already exist) and create a new container in the cluster on a free node.
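
We can also confirm the service exists, and see its replica count and published ports, using docker service ls:

# Needs to be run as root/sudo
$ sudo docker service ls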

We can discover where this container is running using docker service ps <service-name>:

# Needs to be run as root/sudo
$ docker service ps web
ID             NAME      IMAGE        NODE       DESIRED STATE   CURRENT STATE            ERROR     PORTS
eg8fwh5vrqnu   web.1     nginx:1.22   manager    Running         Running 32 seconds ago 

Here we see that we have 1 instance of the web container running (i.e., web.1), and that it is running on the manager node (i.e., our swarm's leader). We can also see that it is in the Running state.

We can access it via the URL of our swarm on the specified port (i.e., 8080):

$ curl http://192.168.64.11:8080/
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Scaling a Service

Let's increase the number of parallel instances of our web service. Currently there is only 1, but we have 3 nodes in our cluster. Let's increase the number to 3:

$ docker service scale web=3
web scaled to 3
overall progress: 3 out of 3 tasks
1/3: running
2/3: running
3/3: running
verify: Service converged

We can look at our service again, and this time we'll see that there are 3 containers:

$ docker service ps web
ID             NAME      IMAGE        NODE       DESIRED STATE   CURRENT STATE            ERROR     PORTS
eg8fwh5vrqnu   web.1     nginx:1.22   manager    Running         Running 5 minutes ago
79wgd0xcrgz3   web.2     nginx:1.22   worker2    Running         Running 54 seconds ago
6jxu33xjvrrn   web.3     nginx:1.22   worker1    Running         Running 53 seconds ago

Updating a Service with Zero Downtime

Previously we deployed nginx v1.22. Let's say that we need to ship a new version, v1.23, and that we want to do this update without bringing down the cluster. Docker can do rolling updates to a scaled service:

$ docker service update --image nginx:1.23 web
web
overall progress: 1 out of 3 tasks
1/3: running
2/3: preparing
3/3:

Here we see that the web service is being updated to use a new Docker image (nginx:1.23). We can also see that docker does a rolling update. The first instance is stopped, updated, then started. Once it reaches the RUNNING state, the next instance is updated, and so on. Eventually all instances are updated across the whole service.
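
The pace of a rolling update can also be tuned. For example, using standard docker service update flags (the values here are only illustrative):

# Update two tasks at a time, wait 10s between batches, and roll back on failure
$ sudo docker service update \
    --update-parallelism 2 \
    --update-delay 10s \
    --update-failure-action rollback \
    --image nginx:1.23 web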

We can see the result of this process once it completes:

$ docker service ps web
ID             NAME        IMAGE        NODE       DESIRED STATE   CURRENT STATE                 ERROR     PORTS
2czdq8ff0zoo   web.1       nginx:1.23   manager    Running         Running about a minute ago
eg8fwh5vrqnu    \_ web.1   nginx:1.22   manager    Shutdown        Shutdown about a minute ago
gl214hm2xbwh   web.2       nginx:1.23   worker2    Running         Running 2 minutes ago
79wgd0xcrgz3    \_ web.2   nginx:1.22   worker2    Shutdown        Shutdown 2 minutes ago
9oqbr2f5gqhh   web.3       nginx:1.23   worker1    Running         Running 2 minutes ago
6jxu33xjvrrn    \_ web.3   nginx:1.22   worker1    Shutdown        Shutdown 2 minutes ago

Here the new instances running nginx:1.23 are Running and the old instances running nginx:1.22 are Shutdown.

Using a Webhook to Update a Service

So far we've done everything manually. Let's automate the deployment process by using an HTTP webhook. We'll simulate using a CI/CD pipeline that needs to trigger a deploy in our cluster via an HTTP request.

To do this, we can use the webhook package. Start by installing it on the manager node:

$ sudo apt install webhook

Next, let's add a deploy.sh script that will update our web service to use a new Docker tag whenever the webhook is called. We'll expect the DOCKER_TAG environment variable to be set with the appropriate nginx tag (e.g., 1.22, 1.23, mainline, 1-alpine, etc).

#!/bin/sh
# Update the swarm's web service to the nginx tag given in DOCKER_TAG

echo "Updating web service to use nginx:$DOCKER_TAG..."
docker service update --quiet --image "nginx:$DOCKER_TAG" web
echo "Finished Update."

Next, create a webhook config file in either JSON or YAML format. We'll create ~/hooks.json:

[
    {
        "id": "deploy",
        "execute-command": "./deploy.sh",
        "include-command-output-in-response": true,
        "pass-environment-to-command": [
            {
                "source": "url",
                "name": "tag",
                "envname": "DOCKER_TAG"
            }
        ]
    }
]

Here we create a new webhook that will be available at /hooks/deploy (i.e., using the id). We also pass our deploy.sh script as the command to execute, and specify that we want to pass a tag value taken from the URL to the command via the DOCKER_TAG environment variable.

Start the webhook running on the manager node:

$ webhook --hooks hooks.json --verbose --port 9999
[webhook] 2023/01/24 13:42:13 version 2.8.0 starting
[webhook] 2023/01/24 13:42:13 setting up os signal watcher
[webhook] 2023/01/24 13:42:13 attempting to load hooks from hooks.json
[webhook] 2023/01/24 13:42:13 found 1 hook(s) in file
[webhook] 2023/01/24 13:42:13 	loaded: deploy
[webhook] 2023/01/24 13:42:13 serving hooks on http://0.0.0.0:9999/hooks/{id}
[webhook] 2023/01/24 13:42:13 os signal watcher ready

The webhook is now listening on port 9999 of the manager node. From outside this node, we can trigger a redeploy and pass it the new image tag to use:

$ curl "http://192.168.64.11:9999/hooks/deploy?tag=mainline-alpine"
Updating web service to use nginx:mainline-alpine...
web
Finished Update.

On the manager node, the webhook debug messages also show that it worked:

[webhook] 2023/01/24 15:57:38 [8b835b] incoming HTTP GET request from 192.168.64.1:52533
[webhook] 2023/01/24 15:57:38 [8b835b] deploy got matched
[webhook] 2023/01/24 15:57:38 [8b835b] error parsing body payload due to unsupported content type header:
[webhook] 2023/01/24 15:57:38 [8b835b] deploy hook triggered successfully
[webhook] 2023/01/24 15:57:38 [8b835b] executing ./deploy.sh (./deploy.sh) with arguments ["./deploy.sh"] and environment [DOCKER_TAG=mainline-alpine] using  as cwd
[webhook] 2023/01/24 15:57:44 [8b835b] command output: Updating web service to use nginx:mainline-alpine...
web
Finished Update.

[webhook] 2023/01/24 15:57:44 [8b835b] finished handling deploy
[webhook] 2023/01/24 15:57:44 [8b835b] 200 | 66 B | 6.126765977s | 192.168.64.11:9999 | GET /hooks/deploy?tag=mainline-alpine

Before we can use this in production, we need to secure it behind HTTPS and also add a Hook Rule that checks for a deploy token or enforces some other security criteria.
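
For example, the deploy hook in hooks.json could be extended with a trigger rule that only fires when a secret token is supplied in the URL (the token value below is a placeholder, and other match types are described in the webhook Hook Rules documentation):

        "trigger-rule": {
            "match": {
                "type": "value",
                "value": "some-long-random-secret",
                "parameter": {
                    "source": "url",
                    "name": "token"
                }
            }
        }

With a rule like this in place, the deploy would be triggered with curl "http://192.168.64.11:9999/hooks/deploy?tag=mainline-alpine&token=some-long-random-secret", and requests without the correct token would be rejected.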