Multistage Dockerfiles: do we still need CI Software?

Over the years, I’ve used Jenkins, Concourse and a few other CI tools. Recently, when multistage Dockerfiles were released, it dawned on me that I used CI software mainly for three things: watching GitHub repos, having a web UI to monitor builds, and defining pipelines.

The last one is the core selling point: defining pipelines lets me design a build pipeline where, at the end, only the necessary stuff is included in the final Docker image.

With multistage Dockerfiles, we can now do that directly inside the Dockerfile. We can define build stages, and at each stage decide what is copied into the next one (with COPY --from).

But then, is it still worth mastering a full-fledged CI tool just for git watching and a web UI?

I finally hacked together a Python/Celery/Flower/RabbitMQ stack that gives me, out of the box: a task queue, a web UI, an API, and a Python framework for tasks.

Here is the docker-compose.yml: (simplified version, don’t use as is)


version: '2'
services:
  # Task queue
  rabbit:
    hostname: rabbit
    image: rabbitmq:3.7.3
    volumes:
      - /home/ubuntu/applications_data/ci-rabbit-prod:/var/lib/rabbitmq  # for data persistence
    environment:
      - RABBITMQ_DEFAULT_USER=${RABBITMQ_DEFAULT_USER}
      - RABBITMQ_DEFAULT_PASS=${RABBITMQ_DEFAULT_PASS}

  # Flower (task queue web UI + API)
  ui:
    image: 10.8.0.1:5500/ci-celery
    command: flower -A worker --port=5555 --persistent=True --db=/db/db --broker=amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@rabbit
    environment:
      - BROKER=amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@rabbit
    volumes:
      - /home/ubuntu/applications_data/ci-flower-prod:/db  # for data persistence
    ports:
      - "10.8.0.1:${CI_PORT}:5555"
    links:
      - rabbit
    depends_on:
      - rabbit

  # Celery worker (picks up tasks from the queue)
  aws-worker-1:
    image: 10.8.0.1:5500/ci-celery
    privileged: true
    command: celery -A worker worker --loglevel=info -Ofair --concurrency=1
    environment:
      - BROKER=amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@rabbit
    volumes:
      - /home/ubuntu/applications_data/shared/logs/CI:/logs
      - /home/ubuntu/applications_data/shared/Services-staging/git-watcher:/git-watcher
      # Give the worker access to the host's Docker daemon and CLI
      - /var/run/docker.sock:/var/run/docker.sock
      - /usr/bin/docker:/usr/bin/docker
    links:
      - rabbit
    depends_on:
      - rabbit

NOTE: the ci-celery Docker image is just a Python 2.7 image with Celery + Flower installed.

I then wrote a celery task that runs docker builds in a given folder.
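To give an idea of what such a task boils down to, here is a minimal sketch of its body, not the actual code. The function names, the `tag` parameter and the injectable `run` argument are all made up for illustration. It is kept as plain functions so it can be read without Celery installed; in the worker module, `build_and_push` would presumably be registered with Celery's `@app.task` decorator.

```python
import subprocess


def docker_build_command(folder, image, tag="latest"):
    """Return the argv list that builds `folder` into `image:tag`."""
    return ["docker", "build", "-t", "{}:{}".format(image, tag), folder]


def docker_push_command(image, tag="latest"):
    """Return the argv list that pushes `image:tag` to its registry."""
    return ["docker", "push", "{}:{}".format(image, tag)]


def build_and_push(folder, image, tag="latest", run=subprocess.check_call):
    """Build the Dockerfile found in `folder`, then push the resulting image.

    `run` is a hypothetical hook so the function can be exercised without
    Docker. By default each command is executed for real, and a non-zero
    exit code raises subprocess.CalledProcessError.
    """
    commands = [
        docker_build_command(folder, image, tag),
        docker_push_command(image, tag),
    ]
    for command in commands:
        run(command)
    return commands
```

The image name is expected to carry the registry prefix, e.g. `10.8.0.1:5500/some-service`, where `some-service` is a placeholder.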

I also set up a simple Node.js GitHub watcher, and configured GitHub webhooks to ping this watcher.

When a repo changes, the webhook pings the Node.js service, which performs a local git pull.
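The watcher itself is written in Node.js, but the webhook-to-pull step can be sketched in a few lines of Python. This is an illustration under assumptions, not the real service: the function name is made up, and it assumes the local clones live in one folder per repository under a common root (such as the /git-watcher volume above). The `ref` and `repository.name` fields come from GitHub's push event payload.

```python
import json
import os

HEADS_PREFIX = "refs/heads/"


def pull_command_for_push(payload_text, repos_root):
    """Map a GitHub push payload to the `git pull` argv for the local clone.

    Returns None for pushes that do not target a branch (tags, for example).
    """
    payload = json.loads(payload_text)
    ref = payload.get("ref", "")
    if not ref.startswith(HEADS_PREFIX):
        return None
    branch = ref[len(HEADS_PREFIX):]
    # basename() guards against a repository name that contains path separators
    repo_name = os.path.basename(payload["repository"]["name"])
    repo_dir = os.path.join(repos_root, repo_name)
    return ["git", "-C", repo_dir, "pull", "origin", branch]
```

The returned list can be handed to `subprocess.check_call` to run the pull.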

It then adds a build task to the RabbitMQ queue through the Flower API. The Celery worker then executes the task, which consists of copying the freshly pulled repo, running a docker build inside it, and pushing the image to our Docker registry.
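Queuing a task through Flower is a single HTTP call to its `/api/task/async-apply/<task name>` endpoint, with the task arguments in a JSON body. Here is a rough Python 3 sketch of that call (the real watcher does it from Node.js). The task name `worker.build`, the Flower URL and the helper names are assumptions for illustration, and any authentication Flower may be configured with is left out.

```python
import json
import urllib.request


def flower_apply_request(flower_url, task_name, args):
    """Build the POST request that asks Flower to queue `task_name`."""
    url = "{}/api/task/async-apply/{}".format(flower_url.rstrip("/"), task_name)
    body = json.dumps({"args": args}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_build(flower_url, task_name, args, timeout=10):
    """Send the request and return Flower's JSON answer as a dict."""
    request = flower_apply_request(flower_url, task_name, args)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `queue_build("http://10.8.0.1:5555", "worker.build", ["/git-watcher/some-repo"])` would ask a worker to build that folder, assuming Flower is reachable on that port and a task with that name is registered.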

All this took me a day to set up and debug. We now have a blazingly fast, fully Docker-based CI system, with a nice web UI and API (thanks to Flower). And all it takes to start it is a “docker-compose up -d”.

It is easy to maintain (60 or so lines of JS for the Node.js git watcher, and approx. 100 lines of Python for the Celery task), and scales well (we just need to add more Celery workers in the docker-compose.yml).

Of course, all this (docker-compose.yml, Node.js service, Python Celery task) is versioned on GitHub.

So far (a few weeks in), this system has been fast and has performed very well at scale, without any fuss. It is in line with the philosophy I try to apply everywhere: no over-engineering, keep things simple, in Docker containers.
