Survival Guide for Docker & Docker Compose
1. Who Should Read This?
You probably know what Docker is and how it is commonly used in practice, but are looking for a quick guide to navigate the jungle. This article summarises the basics of Docker and Docker Compose to help you get started.
This guide began as my notes for lesson 1 of the Data Engineering Zoomcamp, and over time I expanded it with additional references. I have now packaged it as a stand-alone survival guide in the hope of helping those in my situation, who want to get up to speed quickly with these amazing tools.
2. Essential Docker Commands
docker run <image-name>:<tag> <command>
The command does the following in order:
- look up the target image <image-name> with the specified tag <tag> locally, otherwise download it from Docker Hub (if <tag> is not specified, the tag is assumed to be latest)
- spin off a container from the image
- execute <command> in the running container, if provided
- stop the container after <command> finishes
Remarks:
- each execution of the command spins off an independent container
- use an absolute path whenever a local path is needed in docker run (e.g. the local directory in --volume)
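The image lookup in the first step can be illustrated with a small Python sketch. This is a simplified model, not Docker's actual parser: the helper name split_image_ref is hypothetical, and digest references (name@sha256:...) are deliberately not handled.

```python
from typing import Tuple


def split_image_ref(ref: str) -> Tuple[str, str]:
    """Split an image reference into (name, tag), defaulting the tag to "latest".

    Simplified sketch: a ":" only counts as a tag separator when it appears
    after the last "/", so a registry port such as localhost:5000/app is
    not mistaken for a tag.
    """
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > last_slash:
        return ref[:colon], ref[colon + 1:]
    return ref, "latest"


print(split_image_ref("python:3.8"))
print(split_image_ref("ubuntu"))
print(split_image_ref("localhost:5000/app"))
```

In other words, docker run ubuntu and docker run ubuntu:latest refer to the same image.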
Highlighted Flags
Flag | Description |
---|---|
--detach (or -d) | run the container in the background |
--rm | delete the container once it exits. useful if you intend to run the container only once |
--name | give your container a name. useful for inter-container communication, because other containers on the same user-defined network can address your container by its name instead of its IP address |
--env (or -e) | set an environment variable in the container. useful when the containerized service requires environment variables. repeat --env for multiple environment variables |
-it | short for --interactive --tty. keeps the session in the container attached to your terminal so that you can interact with it. useful when the appended command starts an interactive program (e.g. bash) |
--label (or -l) | add a meta label to the container. useful if you want to group containers by labels. as with --env, you can repeat --label for multiple meta labels |
--volume (or -v) | mount a directory on the local host to a directory in the container, e.g. --volume <local-directory>:<container-directory> |
--publish (or -p) | bind a port on the local host to a port in the container. useful for the local host to communicate with the containerized service, e.g. --publish <local-port>:<container-port> |
--entrypoint | override the image's default entrypoint with the specified executable. functionally similar to <command> in this case |
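To see how these flags combine on one command line, here is a minimal Python sketch that only assembles the argument list and does not call Docker. The function name build_docker_run and its parameters are hypothetical; the flags themselves are the real docker run flags (--publish is the long form of -p).

```python
import shlex
from typing import Dict, List, Optional


def build_docker_run(
    image: str,
    command: Optional[List[str]] = None,
    *,
    detach: bool = False,
    remove: bool = False,
    name: Optional[str] = None,
    env: Optional[Dict[str, str]] = None,
    volumes: Optional[Dict[str, str]] = None,
    ports: Optional[Dict[int, int]] = None,
) -> List[str]:
    """Assemble a `docker run` argument list; nothing is executed here."""
    args = ["docker", "run"]
    if detach:
        args.append("--detach")
    if remove:
        args.append("--rm")
    if name:
        args += ["--name", name]
    for key, value in (env or {}).items():
        args += ["--env", f"{key}={value}"]  # one --env per variable
    for host_dir, container_dir in (volumes or {}).items():
        args += ["--volume", f"{host_dir}:{container_dir}"]
    for host_port, container_port in (ports or {}).items():
        args += ["--publish", f"{host_port}:{container_port}"]
    args.append(image)      # flags go before the image name
    args += command or []   # the optional <command> goes last
    return args


postgres_cmd = build_docker_run(
    "postgres:13",
    detach=True,
    name="pgdatabase",
    env={"POSTGRES_USER": "root", "POSTGRES_PASSWORD": "root", "POSTGRES_DB": "ny_taxi"},
    ports={5432: 5432},
)
print(shlex.join(postgres_cmd))
```

The resulting list could be handed to subprocess.run if you want to launch the container from a script.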
Examples
- docker run python:3.8 ls
  spin off a container from the image python:3.8 and then list the folders & files in the container's default directory
- docker run --rm -d python:3.8 python -m http.server
  start an HTTP server in the default directory of a container and run the container in the background (running in the background is useful here, otherwise your terminal session can't resume unless you forcefully shut down the server). once the container is shut down, it is automatically deleted (thanks to --rm)
- docker run -it ubuntu bash
  spin off a container from the ubuntu:latest image, execute bash in the container and then attach to the associated bash session (thanks to -it)
- docker run -it -e POSTGRES_USER="root" -e POSTGRES_PASSWORD="root" -e POSTGRES_DB="ny_taxi" -p 5432:5432 postgres:13
  spin off a container from the postgres:13 image and set 3 environment variables for the Postgres database in the container. additionally, port 5432 of the container is published on port 5432 of the local host. the container then starts the Postgres server and we attach to the associated session
docker container ls
This command lists the attributes of each container. By default it lists running containers only. It is equivalent to docker ps.
Highlighted Flags
Flag | Description |
---|---|
--filter (or -f) | list the containers that satisfy the specified filter. you need to specify a key and (optionally) a value for filtering. add multiple --filter flags for multiple conditions |
--all (or -a) | list all containers (both stopped and running) |
--quiet (or -q) | return container IDs only. particularly useful when you want to pass a list of container IDs to another command |
--format | customize which attributes are printed for the containers |
--size | additionally show the file size of each container. the attribute Size is exposed for you to use in --format. it is a separate flag because querying the file size of each container is costly |
Examples
- docker container ls --all --size --format "ID:{{.ID}} Image:{{.Image}} Size:{{.Size}}"
  print all containers' IDs, image names and file sizes. note that {{.Size}} is only accessible with the --size flag
- docker container ls --all --filter "ancestor=ubuntu" --filter "ancestor=python:3.8" --filter "label=version=1.3" -q
  print the IDs of all containers whose image is ubuntu:latest OR python:3.8, AND whose meta labels contain the key version with value 1.3 (filters on the same key are combined with OR, filters on different keys with AND)
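The --format value is a Go template in which each attribute is written as {{.Name}}. As a rough sketch of how such a command line is put together, the hypothetical helper build_container_ls below builds the argument list from a list of attribute names; it only constructs the arguments and does not call Docker.

```python
from typing import List, Optional


def build_container_ls(
    fields: List[str],
    *,
    show_all: bool = False,
    filters: Optional[List[str]] = None,
) -> List[str]:
    """Assemble a `docker container ls` argument list with a --format template."""
    args = ["docker", "container", "ls"]
    if show_all:
        args.append("--all")
    if "Size" in fields:
        # the guide pairs {{.Size}} with --size, so add the flag when Size is requested
        args.append("--size")
    for condition in filters or []:
        args += ["--filter", condition]  # one --filter per condition
    # each attribute becomes "Name:{{.Name}}" in the Go template
    template = " ".join(f"{name}:{{{{.{name}}}}}" for name in fields)
    args += ["--format", template]
    return args


print(build_container_ls(["ID", "Image", "Size"], show_all=True))
```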
docker container inspect <container-id>
You can pass one or more container names/IDs to the command, and it returns all the metadata associated with those containers.
Highlighted Flags
Flag | Description |
---|---|
--format (or -f) | functionally similar to --format in docker container ls, except that you can access more attributes here (e.g. NetworkSettings.Networks.bridge.IPAddress for the container's IP address) |
--size | additionally expose the attribute SizeRootFs in the metadata. it expresses the container's file size in bytes |
Examples
- docker inspect --size --format "{{.ID}} {{.Config.Image}} {{.SizeRootFs}} {{.NetworkSettings.Networks.bridge.IPAddress}}" $(docker ps -a -q)
  print all containers' IDs, associated image names, file sizes (in bytes) and IP addresses. you can't access the SizeRootFs attribute without the --size flag
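Without --format, the inspect command prints a JSON array, which is easy to post-process in a script. The sketch below parses such output in Python. The sample document is a hypothetical, heavily trimmed stand-in for real output (the ID and IP address are made up), and the function name summarise is my own.

```python
import json
from typing import Any, Dict, List

# Hypothetical, heavily trimmed stand-in for `docker container inspect <id>` output.
SAMPLE_INSPECT_OUTPUT = """
[
  {
    "Id": "0123abcd",
    "Config": {"Image": "postgres:13"},
    "NetworkSettings": {"Networks": {"bridge": {"IPAddress": "172.17.0.2"}}}
  }
]
"""


def summarise(inspect_json: str) -> List[Dict[str, Any]]:
    """Extract the ID, image and per-network IP addresses from inspect output."""
    summary = []
    for container in json.loads(inspect_json):
        networks = container.get("NetworkSettings", {}).get("Networks", {})
        summary.append({
            "id": container["Id"],
            "image": container["Config"]["Image"],
            # map each network name to the container's IP address on it
            "ips": {name: net.get("IPAddress", "") for name, net in networks.items()},
        })
    return summary


print(summarise(SAMPLE_INSPECT_OUTPUT))
```

Looping over every network, rather than hard-coding bridge, also covers containers attached to a user-defined network, which have no bridge entry.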
docker exec <container-id> <command>
Execute the command <command> in a running container <container-id>.
Highlighted Flags
Flag | Description |
---|---|
-it | attach to and interact with the session in the container after the command is executed |
--user (or -u) | execute the command as the specified user (username or UID) |
Examples
- docker exec -it -u 0 <container-id> bash
  start a bash session as the root user (UID 0) in a running container
- docker exec <container-id> pwd
  print the working directory of a running container
Some Other Useful Docker Commands
docker start <container-id>
start a stopped container <container-id> so that it runs again
docker stop <container-id>
gracefully shut down a running container <container-id>
docker kill <container-id>
forcefully shut down a running container <container-id>
docker logs <container-id>
fetch the logs that the command running in <container-id> has written to its terminal
docker cp <src-path> <container-id>:<dest-path>
copy a file from <src-path> on the local host to <dest-path> in the container. to copy a file from the container to the local host, use docker cp <container-id>:<src-path> <dest-path> instead
docker container rm <container-id>
delete the stopped container <container-id>. add --force (or -f) to forcefully stop a running container and then delete it.
e.g. docker rm $(docker container ls --all --filter "ancestor=ubuntu" --quiet) deletes all containers created from the image ubuntu:latest
3. Customise Your Image with Dockerfile
Sometimes you may want to containerize your own service. If you can't find a base image on Docker Hub with all the required dependencies pre-installed, you can create your own custom image.
The workflow for creating and running a custom image is straightforward. You start by writing your own Dockerfile. The file contains all the configuration required to set up your container (e.g. commands to install required dependencies, mounting, commands to start up the service, etc.). Once the Dockerfile is ready, you can build a custom image from it, and then spin off containers from that image.
Step 1: Configure Your Dockerfile
Here are the instructions you can specify in a Dockerfile (some have equivalent flags in docker run). Note that you can access environment variables with $<env_var> (or ${<env_var>}) in a Dockerfile:
Instruction | Description |
---|---|
FROM | base image that you want to build upon. you can specify a tag for the image (e.g. FROM python:3.8) |
RUN | command to run while building the image, usually used to install dependencies (e.g. RUN pip install -r requirements.txt) |
WORKDIR | set the working directory for any instructions that follow. if a relative path is used, it is relative to the path set by the previous WORKDIR instruction |
COPY | copy files or directories from the local host into the image |
LABEL | attach a meta label to the container, equivalent to --label (or -l) in docker run (e.g. LABEL version=4.4 lets you find the container with docker ps --filter label=version=4.4) |
ENTRYPOINT | default command that runs in the container. it is not overridden by the command appended to docker run (only the --entrypoint flag replaces it) |
CMD | default command that runs in the container. it can be overridden by the command appended to docker run. additionally, when used with ENTRYPOINT, it serves as the default parameters to the ENTRYPOINT command |
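The way consecutive WORKDIR instructions stack can be mimicked with path joining. The following is a small sketch rather than Docker's own code: the function name resolve_workdir is hypothetical, and it assumes the base image starts in / (a base image may set a different working directory).

```python
import posixpath
from typing import List


def resolve_workdir(workdirs: List[str], start: str = "/") -> str:
    """Return the working directory after applying WORKDIR values in order."""
    current = start
    for path in workdirs:
        # posixpath.join discards `current` when `path` is absolute,
        # and appends `path` to `current` when it is relative
        current = posixpath.normpath(posixpath.join(current, path))
    return current


print(resolve_workdir(["/working_dir", "working_subdir"]))
```

So WORKDIR /working_dir followed by WORKDIR working_subdir leaves later instructions running in /working_dir/working_subdir.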
Here is an example for containerizing a data pipeline written in a Python script.
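# base image to build upon (placeholder name and tag)
FROM sample-image:sample-tag
# install the dependency needed by the pipeline
RUN pip install pandas
# absolute path: the working directory becomes /working_dir
WORKDIR /working_dir
# relative path: the working directory becomes /working_dir/working_subdir
WORKDIR working_subdir
# copy the script from the local host into the working directory
COPY pipeline.py pipeline.py
LABEL version="1.3"
LABEL create_date="2022-02-01"
# by default the container runs: python pipeline.py 2022-02-01
ENTRYPOINT ["python", "pipeline.py"]
CMD ["2022-02-01"]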
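How ENTRYPOINT, CMD and the arguments of docker run interact in this example can be modelled in a few lines of Python. This is a simplified sketch of the exec-form behaviour as I understand it from the Docker documentation, not Docker's implementation, and effective_command is a hypothetical name.

```python
from typing import List, Optional


def effective_command(
    entrypoint: List[str],
    cmd: List[str],
    run_args: Optional[List[str]] = None,
    entrypoint_override: Optional[str] = None,
) -> List[str]:
    """Return the command line a container would start with (exec form only)."""
    if entrypoint_override is not None:
        # --entrypoint replaces ENTRYPOINT and also discards the image's CMD
        return [entrypoint_override] + (run_args or [])
    # arguments appended to `docker run` replace CMD; otherwise CMD is used
    return entrypoint + (run_args if run_args else cmd)


ENTRYPOINT = ["python", "pipeline.py"]
CMD = ["2022-02-01"]

print(effective_command(ENTRYPOINT, CMD))                            # docker run <image>
print(effective_command(ENTRYPOINT, CMD, run_args=["2022-03-01"]))   # docker run <image> 2022-03-01
print(effective_command(ENTRYPOINT, CMD, entrypoint_override="bash"))  # docker run --entrypoint bash <image>
```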
Step 2: Build an image from your Dockerfile
Build a custom image named <image-name> with tag <tag> from your Dockerfile:
docker build -t <image-name>:<tag> .
- note that rerunning the command overwrites the previous image
- . means using the current directory as the build context and reading the Dockerfile from it
- to check that the image has been successfully built, run docker images to list the available images
Step 3: Fire off a container from your built image
Once the image is built, you can spin off a container from it:
docker run <image-name>:<tag>
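With the example Dockerfile from Step 1, the value from CMD (or whatever you append to docker run) reaches pipeline.py as its first command-line argument. The guide does not show pipeline.py itself, so the following is only a hypothetical sketch of how such a script might read that argument; the function names and the printed message are assumptions.

```python
from datetime import date, datetime
from typing import List


def parse_run_date(argv: List[str]) -> date:
    """Read the date that ENTRYPOINT + CMD pass to the script as argv[1]."""
    if len(argv) < 2:
        raise SystemExit("usage: python pipeline.py <YYYY-MM-DD>")
    return datetime.strptime(argv[1], "%Y-%m-%d").date()


def main(argv: List[str]) -> None:
    run_date = parse_run_date(argv)
    # placeholder for the real pipeline logic (assumption)
    print(f"running pipeline for {run_date.isoformat()}")


# Inside the container this would be main(sys.argv);
# here the argument list is spelled out to keep the sketch self-contained.
main(["pipeline.py", "2022-02-01"])
```

Running docker run <image-name>:<tag> 2022-03-01 would then hand 2022-03-01 to the script instead of the default date.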
4. Manage Your Containers Easily with Docker Compose
Sometimes it can be a pain to manage a group of containers with manual commands. Docker Compose is a tool that simplifies the process. You configure the services in a docker-compose.yml file, and then you can easily manage them (e.g. build images, spin off/stop containers) with docker-compose commands.
As a side note, Kubernetes (aka k8s) is another popular alternative. It supports automatic deployment, scaling and management of containerized services. However, k8s is beyond the scope of this guide.
Let’s say you want to package a Postgres server and a pgAdmin server (GUI tool for interacting with Postgres database) as separate containers. On top of that, the containerized pgAdmin should be able to connect with the containerized Postgres server and the containerized pgAdmin GUI should be accessible from local host. You can do the steps below to manage the containers.
Step 1: Configure Your docker-compose.yml
docker-compose.yml defines all the configuration you want to set for each service. Note that, unlike docker run, we can use relative paths in docker-compose.yml.
services:
  pgdatabase:
    image: postgres:13
    environment:
      - POSTGRES_USER=root
      - POSTGRES_PASSWORD=root
      - POSTGRES_DB=ny_taxi
    volumes:
      - "./ny_taxi_postgres_data:/var/lib/postgresql/data:rw"
    ports:
      - "5432:5432"
  pgadmin:
    image: dpage/pgadmin4
    environment:
      - PGADMIN_DEFAULT_EMAIL=admin@gmail.com
      - PGADMIN_DEFAULT_PASSWORD=root
    ports:
      - "8080:80"
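With this configuration the database can be reached in two ways: from the local host through the published port, and from the pgAdmin container through the service name pgdatabase. The sketch below only builds the two connection URLs to make the difference explicit. The helper postgres_url is hypothetical, and the postgresql:// URL form is the one commonly accepted by Postgres clients.

```python
from urllib.parse import quote


def postgres_url(host: str, port: int = 5432, user: str = "root",
                 password: str = "root", database: str = "ny_taxi") -> str:
    """Build a Postgres connection URL from the values in docker-compose.yml."""
    # quote() keeps special characters in the credentials from breaking the URL
    return f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/{database}"


# from the local host: go through the published port 5432
from_local_host = postgres_url("localhost")
# from the pgadmin container: use the service name as the host name
from_pgadmin = postgres_url("pgdatabase")

print(from_local_host)
print(from_pgadmin)
```

When registering the server in the pgAdmin GUI (reachable at localhost:8080 with this configuration), the host name to enter is therefore pgdatabase rather than localhost.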
Step 2: Bring Up Containers
Once docker-compose.yml is ready, you can run the following command in the same directory to bring up the containers. The command builds (or pulls) the image and spins off a container for each service:
docker-compose up
- similar to docker run, you can add --detach (or -d) to run the group of containers in detached mode
- by default a custom bridge network is created for the group of containers to enable inter-container communication by IP address or name resolution
- unlike docker build, rerunning docker-compose up doesn't rebuild an existing image, even after you update docker-compose.yml
- to have the image reflect your changes, you can either:
  - run docker-compose up --build to enforce rebuilding, or equivalently
  - run docker-compose build before docker-compose up
- note that the container's name will have an additional prefix and suffix on top of the service name you specify in docker-compose.yml. the prefix is based on the name of the folder in which you run docker-compose up. you can override the prefix with the flag --project-name or -p (i.e. docker-compose --project-name <some-prefix> up)
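The naming rule in the last note can be illustrated with a tiny sketch. It assumes the <project>_<service>_<index> pattern of the docker-compose (V1) command. To my knowledge the newer docker compose (V2) plugin joins the parts with hyphens instead, hence the separator parameter. The function name and the folder name de_zoomcamp are hypothetical, and any normalisation Compose applies to the project name is ignored.

```python
def compose_container_name(project: str, service: str, index: int = 1,
                           separator: str = "_") -> str:
    """Combine the project prefix, the service name and the index suffix."""
    return separator.join([project, service, str(index)])


# running `docker-compose up` inside a folder named de_zoomcamp (assumed name)
print(compose_container_name("de_zoomcamp", "pgdatabase"))
# overriding the prefix with `--project-name demo`
print(compose_container_name("demo", "pgadmin"))
```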
Step 3: Bring Down Containers
Finally, you can bring down what you brought up with docker-compose up by:
docker-compose down
- it not only shuts down the containers, but also removes the containers and the associated custom bridge network
- in case any residual containers remain, you can run docker-compose rm in the same directory to erase them
5. References
- YouTube: DE Zoomcamp 1.2.1 - Introduction to Docker
- Docker Documentation: docker run
- Docker Documentation: docker container ls
- GitHub Issue: Docker PS filter by ancestor image does not match tagged images if image tag is omitted from filter · Issue #24295 · moby/moby
- YouTube: Manage Docker Easily With VS Code