Docker toolbelt
Table of Contents
- 1. Intro
- 2. What do we want?
- 3. When do we want it?
- 4. Why do we want it?
- 5. How are we going to do it?
- 6. What is a container?
- 7. ETOOMANYFLAGS
- 8. What are containers made of?
- 9. can I run 2 containers on the same filesystem?
- 10. patterns
- 11. Build
- 12. Dockerfile
- 13. images can extend other images
- 14. can I only "inherit" from an official image?
- 15. Examples?
- 16. How do we know it?
- 17. Let's play with docker inspect
- 18. Where/how are those images stored?
- 19. Dockerfiles are like
- 20. Running a container
- 21. Minimal flags
- 22. testing images and containers
- 23. Running multiple commands
- 24. Commit
- 25. Env vars
- 26. If I am a program running on a container, what do I see?
- 27. Can a container run more than one process?
- 28. Using cat from ubuntu container
- 29. mount a file and cat it from inside
- 30. Playing with scratch image
- 31. Entrypoint
- 32. Network
- 33. Deploy SPA Frontend and Backend
- 34. container names in a docker-compose
- 35. Traefik and the whole shebang
- 36. can I join?
- 37. http_proxy, dns, and networks
- 38. Dok
- 39. gojira
- 40. ryu
- 41. mba
- 42. hma
- 43. Docker as a tool to help you hack on development
- 44. dev
- 45. see logs for a container
1 Intro
Hey! In this doc we're going to learn a few things about docker, and about development environments. Stay tuned.
2 What do we want?
The aim of this doc is to learn about docker and docker compose through a practical project: a development environment generator, which I'll be building with you as we go.
3 When do we want it?
ASAP
4 Why do we want it?
To have awesome pipelines, development environments, and to learn about containers.
5 How are we going to do it?
At first, it's going to be about docker. Then we'll use docker-compose, and finally we'll generate docker-compose files automatically, like there's no tomorrow.
6 What is a container?
Do you know chroot? OK, imagine you create a chroot and you run something inside it. It will run successfully, and that process will only have access to that part of the filesystem. But will the process you're running be able to see other processes? What would htop say? And what about ping www.google.com?
Containers are how we name kernel namespaces (processes, network, mounts) and cgroups working together to isolate processes from other processes, or group them together. LXC and docker are tools built on top of those kernel features.
7 ETOOMANYFLAGS
We'll use docker. All container systems have too many flags. There are a lot of things they can tune, but the number of flags is scary. There's no way around it.
8 What are containers made of?
A container needs a few things to run.
- filesystem. The same way a VM needs an .img, which is the "saved state" that the VM runner uses as a filesystem, container runtimes also need a fake filesystem to use as the container's main filesystem.
- networking. Different containers can be running at the same time, and you can choose to make them see each other or not. Or you can choose to give each one an IP on the same network as the host, or not. You can even give them the --net host option so that all the ports they open live in the host's own network. In that case, if you run one container that opens port 3000 and another that opens 4000, you can reach all services from all containers, or even from the host, using localhost:XXXX.
- cpu
9 can I run 2 containers on the same filesystem?
You can run 2 containers from the same image, making them 2 different running containers, and they should be independent from one another. They will have access to the same filesystem (except for the last layer, which is the RW one, which is independent for each container).
If you modify the filesystem, one container won't see the modifications of the other, because they run "copies" of the image filesystem. The shared layers are read-only, so each container feels it runs in complete isolation (unless you mount parts of the filesystems from one to the other).
10 patterns
We'll start seeing different patterns, like small exercises, and we'll start adding them up. It's a bit like the * Schemer style, only worse (because I'm not Dan).
11 Build
docker build image-directory will create an image from a list of commands in a Dockerfile.
Images are the filesystem of the container. Think of a chroot, or think of the .img or .vdi file you download to run in VirtualBox.
12 Dockerfile
A Dockerfile contains the recipe to create the image and leave it in your desired state, ready to be run later in a container.
It's like a shell script with some special commands, but it's an imperative script that gets run by docker build. Its goal is to leave a filesystem in a state where it can run your app when you docker run img.
Q: What runs the dockerfile? in which environment?
13 images can extend other images
Dockerfiles can be divided in 3 parts
FROM XXXYYYY      # <- 1 From
.....             # <- 2 Build more things (RUN, COPY, ...)
.....
.....
.....
.....
CMD/ENTRYPOINT    # <- 3 Run command
You usually start from an image that already has some contents in its filesystem. That's why Dockerfiles usually start with the name of a distro, like FROM ubuntu:20.04. If you'd like a completely empty filesystem to start from, there's an image called scratch, which looks like a recently formatted filesystem.
The CMD/ENTRYPOINT part is used to give a default entrypoint or command to run when we run a new container from that image. The command can be overridden, but it's nice to have good defaults. RUN is different: it executes at build time, so it belongs to part 2.
14 can I only "inherit" from an official image?
You can inherit from any image. Images can be in a container registry (usually on the internet), or locally, where you have a local collection of already downloaded images, ready to be used.
Also, when you're using docker build, you are creating a new image.
15 Examples?
cat <<EOF >Dockerfile
FROM ubuntu:20.04
EOF
docker build .
....
Successfully built 7e0aa2e69a15
At this point, 7e0aa2e69a15 is exactly the same filesystem as ubuntu:20.04. It's like subclassing a class without changing any method.
16 How do we know it?
diff <(docker inspect 7e0aa2e69a15) <(docker inspect ubuntu:20.04)
17 Let's play with docker inspect
Let's try to change the default command that our container will run.
cat <<EOF >Dockerfile
FROM ubuntu:20.04
CMD ["tail", "-f", "/dev/null"]
EOF
docker build .
....
Successfully built aabbccdd
Now, let's diff them:
diff <(docker inspect aabbccdd) <(docker inspect ubuntu:20.04)
Now we see the differences, and they mostly make sense. I guess you can now be confident with inspect. Everything should make sense.
18 Where/how are those images stored?
Instead of having a .iso, .img, or .vdk, images are stored as a directory with a bunch of data and metadata.
19 Dockerfiles are like
git clone ubuntu:20.04 --depth=1
cmd1
git add -A; git commit -m 'step1'
cmd2
git add -A; git commit -m 'step2'
....
If you imagine 2 Dockerfiles that are equal except for the last line, all lines but the last produce the same images. Docker is smart enough to share them, so the "blobs" (in git parlance) are unique. To keep the analogy working, you've got to ignore the fact that git would create different commits because they happen on different dates.
20 Running a container
Once we have the image filesystem, we're ready to bring its contents to life. When we run a process inside that image, jailed in a docker network (described in the docker run command), we are "starting" the container.
At that moment, you can think of a last ephemeral commit being added to that chain of commits. We could be modifying files there, and we would see them, but when we remove the container, that layer disappears.
21 Minimal flags
- docker run --rm -ti image command. Those flags are the basic combination. --rm tells it to clean up after itself, removing whatever it created to make a container from image. -ti binds an interactive terminal, so we can communicate with it. You want that when running things locally 90% of the time.
- docker run --net=host. This flattens all networking of that container to use the same IP as the host, so everything lives in the same machine and can reach everything else via localhost. The problem with it is that ports may collide. Just be aware it's a shortcut, and you'll probably want to fix it at some point.
- docker run -v $PWD:/my-dir mounts a directory from host to container.
22 testing images and containers
docker run --rm -ti ubuntu:20.04 bash. This will open a bash starting from an ubuntu:20.04 image.
In another terminal, run docker run --rm -ti ubuntu:20.04 bash again.
You can see that both containers are running independently (files created in one are not seen in the other one).
And each one of them thinks it's unique. Try to run top in each one of them. They shouldn't see each other.
But still, if you run docker ps from the host, you can see both containers running from the same image.
23 Running multiple commands
docker run --rm -ti ubuntu "apt update | apt install foo" doesn't work, because docker takes the whole quoted string as the name of a single executable. If you want to run several commands from one run command, for example to test things out from a script, you can wrap them in a shell:
docker run --rm -ti ubuntu bash -ci "apt update && apt install net-tools"
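The choice of && over | matters here: && runs the second command only if the first one succeeded. Here is a minimal sketch that can be tried on the host without docker; run_chain is a hypothetical helper, only there for illustration.

```shell
#!/usr/bin/env bash
# run_chain is a hypothetical helper: it hands one string to a fresh shell,
# the same way `docker run ... bash -c "<string>"` does inside a container.
run_chain() {
  bash -c "$1"
}

# && stops at the first failing command, so "second" is never printed here.
run_chain 'false && echo second' || echo "first command failed"

# When the first command succeeds, the second one runs.
run_chain 'true && echo second'
```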
24 Commit
I said that a running container has that ephemeral last layer, where your modifications happen. They survive a stop/start (and even a kill, which only stops the process), but they don't survive an rm, which is what --rm does automatically on exit.
But there's a way to make the current state permanent, and effectively create an image, from where new containers can be spawned or new Dockerfiles can start FROM.
docker run -ti ubuntu:20.04 bash
# inside the container
echo 'hi' >/tmp/foo
And in the host
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c6e13a79c7b3 ubuntu "bash" 15 seconds ago Up 14 seconds determined_wu
$ docker commit c6e13a79c7b3
sha256:d6581aae94a58d9be27b6fecf576630fead7a05b003be12796152110d7f0b010
$ docker run --rm -ti d6581aae94a58d9be27b6fecf576630fead7a05b cat /tmp/foo
hi
See? We created a new image, and we started a container from it.
25 Env vars
Build time: .env. Runtime: -e.
Env vars that appear in the compose file can be overridden, others can't (imagine if we were able to mount and update LD_PRELOAD, not good).
- https://docs.docker.com/compose/environment-variables/
- https://stackoverflow.com/questions/43106459/environment-variable-assignment-in-docker-compose-colon-way
- https://docs.docker.com/compose/compose-file/compose-file-v3/
In projects/empty-dict-value there's an example of different ways to interact with them:
- docker-compose run -e foo=123 mything0 ; docker exec -ti … bash ; echo $foo
26 If I am a program running on a container, what do I see?
Try it. We said there are mostly 3 things that get isolated: network, filesystem and processes.
26.1 Processes
Instead of bash, you can run any other program as the main one. For example, docker run -ti --rm ubuntu top. This will run top, and list all processes top can see. Not many, really.
26.2 Network
Let's play with netstat. Netstat is not installed by default, so we'll have to install it each time.
# isolated
docker run --rm -ti ubuntu bash -ci "apt update && apt install net-tools && netstat -atunp"
# shared network
docker run --net=host --rm -ti ubuntu bash -ci "apt update && apt install net-tools && netstat -atunp"
26.3 filesystem
You should be able to peek through directories by mounting volumes.
docker run --rm -ti -v /tmp:/mything ubuntu ls /mything
27 Can a container run more than one process?
Yes. Let's try something
docker run --rm -ti ubuntu bash
# inside the container's shell
top
There you see 2 processes, bash and top.
28 Using cat from ubuntu container
docker run --rm ubuntu cat /etc/lsb-release
29 mount a file and cat it from inside
docker run --rm -v /etc/lsb-release:/etc/lsb-release ubuntu cat /etc/lsb-release
30 Playing with scratch image
We'll create a filesystem starting from the absolute minimum. scratch looks like an empty filesystem.
To do the bare minimum exploration possible, we'll get a busybox from our host and add it to the image. It has to be a statically linked busybox, because scratch has no shared libraries to load.
cp /usr/bin/busybox ./busybox
FROM scratch
COPY ./busybox /usr/bin/busybox
COPY ./busybox /bin/sh
CMD ["/usr/bin/busybox", "pwd"]
After that, docker build --tag bare-min . to build the image, and then you can start playing with docker run --rm -ti bare-min.
If you try another command, like docker run --rm -ti bare-min ls, docker will complain that the "ls" executable is not in $PATH. Here we see that if we override the command from the shell, CMD is replaced by what we entered. The way to fix it is to enter the full command, so docker run --rm -ti bare-min /usr/bin/busybox ls will work.
31 Entrypoint
A way to lock the main program that runs, but still let users modify its arguments, is to use ENTRYPOINT.
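As a sketch, reusing the busybox binary from the previous section: ENTRYPOINT fixes the executable, and CMD becomes its default arguments. The Dockerfile.entrypoint file name and the bare-min-entry tag are illustrative names, not part of the original project.

```shell
#!/usr/bin/env bash
# Write a Dockerfile where ENTRYPOINT pins busybox as the program to run
# and CMD only provides the default argument.
cat <<'EOF' >Dockerfile.entrypoint
FROM scratch
COPY ./busybox /usr/bin/busybox
ENTRYPOINT ["/usr/bin/busybox"]
CMD ["pwd"]
EOF

# With docker available, build and run it like this:
#   docker build --tag bare-min-entry -f Dockerfile.entrypoint .
#   docker run --rm bare-min-entry        # runs: /usr/bin/busybox pwd
#   docker run --rm bare-min-entry ls     # runs: /usr/bin/busybox ls
```

Whatever you type after the image name now replaces only CMD, so it is passed as arguments to busybox. The entrypoint can still be replaced explicitly with docker run --entrypoint.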
32 Network
Footgun Alert:
- If you create a docker network via docker network create foo, the name stays exactly foo.
- In a docker compose, the network is named after the project, so if the structure is like myproject/docker-compose.yml, and you docker-compose up and then docker network ls, you'll see a myproject_default network.
You can name networks in docker-compose:
version: '3.5'
services:
  httpbin:
    # this should basically match docker_reverse_proxy.tf
    image: kennethreitz/httpbin
    ports:
      - 8888:80
    networks:
      my_net:
networks:
  my_net:
Run docker network ls and you'll see that the network is named… myproject_my_net.
In order to give it an absolute name, you can add a "name" to the network.
version: '3.5'
services:
  httpbin:
    # this should basically match docker_reverse_proxy.tf
    image: kennethreitz/httpbin
    ports:
      - 8888:80
    networks:
      my_net:
networks:
  my_net:
    name: my_net
Now docker network ls will tell us that the network is called my_net. For real :+1:
33 Deploy SPA Frontend and Backend
In an SPA that makes calls to the backend, the isolation that we would love in a docker-compose falls apart, because the external world has to have access to both the frontend AND the API backend.
docker-compose can publish random free ports, but in this case the frontend has to know beforehand which port number the backend is going to use. From a practical perspective, that means we have to publish "8888" in a fixed place, so we'll have a harder time running multiple instances of the stack (ports will collide unless you play around with passing env vars with random ports to publish in the backend AND frontend).
+----------------------------------+
|             Browser              |
|  http://localhost:3000           |
|  js:ajax(http://localhost:8888)  |
|                                  |
+----------------------------------+
       ^                  ^
       | 3000             | 8888
       |                  |
+------+------------------+--------------+
|      |                  |              |
| +----+--------+  +------+----------+   |
| |             |  |                 |   |
| |  frontend   |  |     backend     |   |
| |             |  |                 |   |
| +-------------+  +-----------------+   |
|                                        |
| +-------------+                        |
| |             |                        |
| |     db      |                        |
| |             |                        |
| +-------------+                        |
|                                        |
+----------------------------------------+
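The env var workaround could be sketched like this. It assumes the compose file publishes the frontend and backend ports through variable substitution (${FRONTEND_PORT} and ${BACKEND_PORT}); those variable names, the stack_ports helper and the myproject1 project name are all hypothetical.

```shell
#!/usr/bin/env bash
# stack_ports N prints the (hypothetical) env vars for the N-th copy of the
# stack: instance 0 keeps the ports from the diagram, each next one adds 1.
stack_ports() {
  local n=${1:-0}
  echo "FRONTEND_PORT=$((3000 + n))"
  echo "BACKEND_PORT=$((8888 + n))"
}

# Show the ports a second instance would get.
stack_ports 1

# With docker available, export them and start the stack under its own
# project name, so container and network names don't collide either:
#   export $(stack_ports 1)
#   docker-compose -p myproject1 up
```

The frontend still has to be told the value of BACKEND_PORT, so that its ajax calls point to the right place.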
34 container names in a docker-compose
A very similar situation to the network one happens with containers themselves. That previous service httpbin results in a container named myproject_httpbin_1. You can fix the name of the container with container_name: httpbin1. That way you can have global names.
35 Traefik and the whole shebang
So, Traefik is this reverse proxy that is able to manage your containers in a fairly automatic way.
For every container in a docker network that you manage with traefik, traefik will check the container's labels, looking for the traefik labels that are used for configuring the routes. It's kinda distributed, in the sense that you don't go poke the traefik app to configure it; it detects when containers come up and checks them by itself.
It also has some default rules, so you can, for example, easily match the host header to redirect to the container_name.
But how are the container names matched? Is it foo, or myproject_foo, or myproject_foo_1?
It turns out it's smart enough to figure out what you mean.
version: '3.5'
services:
  traefik:
    image: traefik:v2.0
    ports:
      - "8888:80"
      - "8080:8080"
    command:
      # https://doc.traefik.io/traefik/v2.0/providers/docker/#defaultrule
      # - --providers.docker.defaultRule=Host(`{{ normalize .Name }}`
      - --api.insecure=true
      - --providers.docker=true
      # - --entrypoints.web.address=:80
      # - --providers.docker.network="my_net",
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      hosting:
  httpbin1:
    image: kennethreitz/httpbin
    # container_name: httpbin2
    networks:
      hosting:
    labels:
      - traefik.enable=true
      - traefik.http.routers.httpbin1.rule=Host(`httpbin1`)
      # - traefik.http.routers.httpbin1.entrypoints=web
networks:
  hosting:
    name: my_net
With this, after docker-compose up, you can start trying things with http :8888/status/202 Host:httpbin1 (http is the httpie client):
If you use a container_name, the Host will try to match it exactly. If there is no container_name, traefik still recognizes Host with just the service name (httpbin1 in this case). If you use myproject_httpbin_1, it WON'T work, so it's DWIM, but it has its own warts.
https://doc.traefik.io/traefik/user-guides/docker-compose/basic-example/
https://doc.traefik.io/traefik/v2.0/providers/docker/#defaultrule
36 can I join?
New containers can join the network, and will be playing the game without having been started on the same docker-compose.
docker run --rm --net my_net --name httpbin2 --hostname httpbin2 kennethreitz/httpbin
http :8888/status/202 Host:httpbin2
37 http_proxy, dns, and networks
./projects/network_proxy contains a testing project that has:
- coredns with a fixed ip
- httpbin
- ubuntu (we'll use it as our shell)
- tinyproxy. Lives in the same network
tinyproxy_host lives in the host namespace, so it's accessible from localhost:8888 from the host, or host.docker.internal from the other containers.
We can start testing things around by doing docker-compose up, and on another shell, docker-compose exec main bash. You can run ./script.sh once inside.
38 Dok
39 gojira
40 ryu
41 mba
42 hma
So far, the most minimalistic approach.
The goal for this project is to have a dev environment for our devs, with as little code as possible besides the docker-compose.yml file. There are a few basic traits:
- Support for local non-committed customizations. The 3 files involved are hma.yml, hma.mac.yml and hma.override.yml. The first one is the main docker-compose file, the .mac. one has mac-only modifications, and hma.override.yml is a local non-committed file that will be merged on top of the other two.
- Augment docker-compose, but keep docker-compose knowledge relevant and useful. hma uses docker-compose under the hood, and proxies all commands to docker-compose. It adds the command do, which tries to be smart about the thing to execute, but all old docker-compose knowledge is directly applicable here.
  - hma up runs docker-compose -f hma.yml -f hma.mac.yml -f hma.override.yml up
  - hma exec ... runs docker-compose -f hma.yml -f hma.mac.yml -f hma.override.yml exec ...
  - hma do runs docker-compose -f hma.yml -f hma.mac.yml -f hma.override.yml exec <main_service> $HMA_CLI
  - hma do db runs docker-compose -f hma.yml -f hma.mac.yml -f hma.override.yml exec db $HMA_CLI
  - hma do db bash runs docker-compose -f hma.yml -f hma.mac.yml -f hma.override.yml exec db bash
- Sane defaults. The <main_service> is guessed from the name of the directory.
Key points (tricks) of this approach:
42.1 Mount every container's /root to ~/.hma/.hma-home/
We'll mount every service's /root directory to a common directory in our host machine.
This way, all your hma containers will have a persistent $HOME, meaning that you'll get persistent history and a fully customizable environment. Try modifying ~/.hma/.hma-home/.bashrc to add a new alias, or add a script to ~/.hma/.hma-home/bin/ (and add /root/bin to the $PATH in .bashrc).
An extra benefit is that all containers share this space at the same time, so you can copy files among containers and the host without needing to remember docker cp. It's just like a shared drive.
volumes:
  - $HOME/.hma/.hma-home:/root:rw
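For instance, the alias and the bin/ directory mentioned above could be set up from the host like this. It is only a sketch: HMA_HOME is a hypothetical variable that defaults to the directory used in this doc, and the ll alias is just an example.

```shell
#!/usr/bin/env bash
# HMA_HOME is a hypothetical knob; by default it is the shared directory
# that gets mounted as /root in every hma container.
HMA_HOME=${HMA_HOME:-$HOME/.hma/.hma-home}

mkdir -p "$HMA_HOME/bin"

# Append an example alias and put /root/bin on the PATH. Inside the
# containers this file is /root/.bashrc.
cat <<'EOF' >>"$HMA_HOME/.bashrc"
alias ll='ls -la'
export PATH="/root/bin:$PATH"
EOF
```

The lines are appended, so running the snippet twice adds them twice. Any executable dropped into that bin/ directory on the host then shows up as a command in every container that mounts it.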
42.2 Customize the main app's service
The main container needs a few modifications. Change the command and/or entrypoint to a waiting loop like tail -f /dev/null. This way we keep the container alive and we can start our app manually from within a hma do, which will be an equivalent of docker-compose exec <main_service> $HMA_CLI.
We should add the environment variable HMA_CLI, pointing to some meaningful entrypoint for our dev container.
Build a developer friendly container to run the app. This container can come FROM a microsoft devcontainer, or you can build your own. It should be dev-friendly enough to do some file navigation and basic development in.
The volumes to mount are the main one for the app code (which we'll also set as the working_dir), aws credentials, and of course, the trick to mount the $HOME dir of the user to our "portal" directory in our host.
services:
  #....
  harbormaster:
    command: tail -f /dev/null
    build:
      context: $PWD/.devcontainer
      dockerfile: Dockerfile
    volumes:
      - $HOME/.aws/credentials:/root/.aws/credentials:rw
      - $PWD:/app/:rw
      - $HOME/.hma/.hma-home:/root:rw
    working_dir: /app
    environment:
      HMA_CLI: bash
42.3 Dockerfile
To prepare your main container ($PWD/.devcontainer/Dockerfile in the example above), you should leave space for extension in the .override.yml file. Here's an example:
FROM mcr.microsoft.com/vscode/devcontainers/java:11
ARG EXTRA_DEPS
RUN apt-get update && apt-get -y install postgresql-client-12 $EXTRA_DEPS
With this trick, the final user will be able to install custom packages by editing the hma.override.yml
harbormaster:
  build:
    args:
      EXTRA_DEPS: nvim iputils-ping httpie
42.4 Development
Since we mount the app directory from the host to the container, development can keep happening on the host, using your favourite editor.
In the case of Clojure there are a few more tunings to do:
- Expose a port from the container to the host
- Configure lein/deps.edn to use that port for the repl.
go-to-definition will break using this approach, because when your editor (emacs in my case) asks for a function location through cider, it gets a path and location according to the machinery inside the container, which most likely won't match your external path.
environment:
  LEIN_REPL_PORT: 7888
  LEIN_REPL_HOST: "0.0.0.0"
expose:
  - '7888'
ports:
  - '7888:7888'
cider-path-translations maps the container paths back to host paths:
(setq cider-path-translations
      '(("/app/harbormaster/source" . "~/workspace/harbormaster")
        ("/app/metabase/source" . "~/workspace/metabase")
        ("/root/.m2/" . "~/.hma/.hma-home/.m2/")))
42.5 Code
#!/usr/bin/env bash

die() { echo "$*" ; exit 1; }
mkcd() { mkdir -p "$1"; cd "$1"; }

# Prints the postgres init script that tunes logging and creates
# every database listed in $POSTGRES_MULTIPLE_DATABASES.
pg-create-dbs-file() {
cat <<'EOF'
#!/bin/bash
pg_conf_file=/var/lib/postgresql/data/postgresql.conf
echo "\
log_statement = 'all'
log_disconnections = off
log_duration = on
log_min_duration_statement = -1
shared_preload_libraries = 'pg_stat_statements'
track_activity_query_size = 2048
pg_stat_statements.track = all
pg_stat_statements.max = 10000
" >>$pg_conf_file

for database in $(echo $POSTGRES_MULTIPLE_DATABASES | tr ',' ' '); do
  echo "Creating database $database"
  psql -U $POSTGRES_USER <<-EOSQL
CREATE DATABASE $database;
GRANT ALL PRIVILEGES ON DATABASE $database TO $POSTGRES_USER;
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
EOSQL
  psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" -d "$database" <<-EOSQL
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
EOSQL
done
EOF
}

# Checks that we are in an hma repo and prepares the shared host directories.
hma-setup() {
  [ -f "$1" ] && grep -siq 'HMA_CLI' "$1" || die "Not an hma-compatible repo."
  [ -d $HOME/.mba/.mba-home/ ] || mkdir -p $HOME/.mba/.mba-home
  [ -d $HOME/.mba/resources/postgres/docker-entrypoint-initdb.d/ ] || (
    mkcd $HOME/.mba/resources/postgres/docker-entrypoint-initdb.d/ &&
      pg-create-dbs-file > 00-create-pg-db.sh &&
      chmod +x 00-create-pg-db.sh
  )
}

# hma do [service] [command]: defaults to the directory name and $HMA_CLI.
hma-exec () {
  local where=${1:-$(basename $PWD)}
  shift
  local cmd=${@:-'$HMA_CLI'}
  hma exec "$where" sh -lic "$cmd"
}

# hma up [maildev db]
# hma down
# hma do [service] [$HMA_CLI]
# hma do db
# hma do db bash
hma() {
  fname="hma"
  hma-setup "${fname}.yml"
  if [[ $1 == "do" ]]; then
    shift
    hma-exec "$@"
  else
    # Build the list of compose files as separate array elements.
    local files=(-f "${fname}.yml")
    [[ "$OSTYPE" == "darwin"* ]] && [ -f "./${fname}.mac.yml" ] && files+=(-f "./${fname}.mac.yml")
    [ -f "./${fname}.override.yml" ] && files+=(-f "./${fname}.override.yml")
    docker-compose "${files[@]}" "$@"
  fi
}

hma "$@"
43 Docker as a tool to help you hack on development
44 dev
45 see logs for a container
https://stackoverflow.com/questions/41144589/how-to-redirect-docker-container-logs-to-a-single-file
docker inspect --format='{{.LogPath}}' containername