Multi-stage Dockerfiles: do we still need CI software?

François Ruty
Jul 25, 2018


Over the years, I’ve used Jenkins, Concourse, and a few other CI tools. Recently, when the multi-stage Dockerfile feature was released, it dawned on me that I used CI software for mainly three things: watching GitHub repos, monitoring builds through a web UI, and defining pipelines.

The last one is the core selling point: defining pipelines lets me design a build process where, at the end, only the necessary artifacts are included in the final Docker image.

With multi-stage Dockerfiles, we can now do that directly inside the Dockerfile: we can define build stages and, at each stage, decide what is passed on to the next one.
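For example, here is a minimal sketch (the base images and paths are illustrative, not from my actual setup): the first stage builds the app with all its compile-time dependencies, and the final stage starts from a clean base image and copies in only the build artifacts.

# Stage 1: build environment, with all compile-time dependencies
FROM node:8 AS builder
WORKDIR /app
COPY . .
RUN npm install && npm run build

# Stage 2: runtime image; only the built artifacts are carried over
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html

Everything that stays in the builder stage (compilers, node_modules, caches) never reaches the final image.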

But then, is it still worth mastering a full-fledged CI system just for git watching and a web UI?

I finally hacked together a Python/Celery/Flower/RabbitMQ stack to get, out of the box: a task queue, a web UI, an API, and a Python framework for tasks.

Here is the docker-compose.yml (simplified version, don’t use as is):

version: '2'
services:
  # Task queue
  rabbit:
    hostname: rabbit
    image: rabbitmq:3.7.3
    volumes:
      - /home/ubuntu/applications_data/ci-rabbit-prod:/var/lib/rabbitmq # for data persistence
    environment:
      - RABBITMQ_DEFAULT_USER=${RABBITMQ_DEFAULT_USER}
      - RABBITMQ_DEFAULT_PASS=${RABBITMQ_DEFAULT_PASS}

  # Flower (task queue web UI + API)
  ui:
    image: 10.8.0.1:5500/ci-celery
    command: flower -A worker --port=5555 --persistent=True --db=/db/db --broker=amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@rabbit
    environment:
      - BROKER=amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@rabbit
    volumes:
      - /home/ubuntu/applications_data/ci-flower-prod:/db # for data persistence
    ports:
      - "10.8.0.1:${CI_PORT}:5555"
    links:
      - rabbit
    depends_on:
      - rabbit

  # Celery worker (picks up tasks from the queue)
  aws-worker-1:
    image: 10.8.0.1:5500/ci-celery
    privileged: true
    command: celery -A worker worker --loglevel=info -Ofair --concurrency=1
    environment:
      - BROKER=amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@rabbit
    volumes:
      - /home/ubuntu/applications_data/shared/logs/CI:/logs
      - /home/ubuntu/applications_data/shared/Services-staging/git-watcher:/git-watcher
      - /var/run/docker.sock:/var/run/docker.sock
      - /usr/bin/docker:/usr/bin/docker
    links:
      - rabbit
    depends_on:
      - rabbit

NOTE: the ci-celery Docker image is just a Python 2.7 image with Celery and Flower installed.

I then wrote a Celery task that runs Docker builds in a given folder.
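In essence, the task looks something like this (a simplified sketch, not the exact production code; the module name, task name, and arguments are assumptions):

# worker.py -- simplified sketch of the build task
import os
import shutil
import subprocess
import tempfile

from celery import Celery

app = Celery('worker', broker=os.environ['BROKER'])

@app.task
def build(repo_path, image_tag):
    """Copy the freshly pulled repo, build the Docker image, push it."""
    # Work on a copy so a concurrent git pull can't change files mid-build
    workdir = tempfile.mkdtemp()
    build_dir = os.path.join(workdir, 'repo')
    try:
        shutil.copytree(repo_path, build_dir)
        # The worker mounts the host's docker socket and binary (see the
        # docker-compose.yml above), so this talks to the host daemon
        subprocess.check_call(['docker', 'build', '-t', image_tag, build_dir])
        subprocess.check_call(['docker', 'push', image_tag])
    finally:
        shutil.rmtree(workdir)
    return image_tag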

I also set up a simple Node.js GitHub watcher, and configured GitHub webhooks to ping this watcher.

When a repo changes, the webhook pings the Node service, which performs a local git pull.

It then adds a build task to the RabbitMQ queue, using the Flower API. The Celery worker then executes the task, which consists of copying the freshly pulled repo, running a docker build inside it, and pushing the image to our Docker registry.
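Concretely, enqueueing a build boils down to one HTTP POST to Flower’s async-apply endpoint (a sketch; the task name, arguments, and Flower address are assumptions based on the setup above):

import requests

# Enqueue the (hypothetical) worker.build task via Flower's REST API;
# the host/port is wherever the ui service from the docker-compose.yml is exposed
resp = requests.post(
    'http://10.8.0.1:5555/api/task/async-apply/worker.build',
    json={'args': ['/git-watcher/my-repo', '10.8.0.1:5500/my-repo:latest']},
)
resp.raise_for_status()
print(resp.json())  # contains the task id and state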

All this took me a day to set up and debug. We now have a blazingly fast, fully Docker-based CI system, with a nice web UI and API (thanks to Flower). And it all takes a single “docker-compose up -d” to start.

It is easy to maintain (60 or so lines of JS for the Node.js git watcher, and approx. 100 lines of Python for the Celery task), and it scales well (you just need to add more Celery workers to the docker-compose.yml, as sketched below).
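For instance, a second worker is just one more service block pointing at the same broker (volumes omitted for brevity):

  aws-worker-2:
    image: 10.8.0.1:5500/ci-celery
    privileged: true
    command: celery -A worker worker --loglevel=info -Ofair --concurrency=1
    environment:
      - BROKER=amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@rabbit
    links:
      - rabbit
    depends_on:
      - rabbit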

Of course, all of this (docker-compose.yml, Node.js service, Python Celery task) is versioned on GitHub.

So far (a few weeks in), this system has been performing very well at scale, without any fuss, and it is very fast. It is in line with the philosophy I try to apply everywhere: no over-engineering, keep things simple, in Docker containers.


Originally published at fruty.io on July 25, 2018.
