Docker Fundamentals
Docker is the tool that popularised containers (starting in 2013) by wrapping Linux namespaces, cgroups, and a layered filesystem behind a friendly CLI and image format. Today, “Docker” is three things that are easy to confuse: an engine (the daemon that runs containers), a CLI (the `docker` command), and an image registry ecosystem (Docker Hub plus the image spec). Since 2015 all three have been standardised (see Containers Fundamentals → OCI), so most of what you learn about Docker applies to Podman, containerd, and nerdctl too.
Architecture in one picture
```
┌──────────────────────────────────────────────────────────────┐
│                          docker CLI                          │
│ (runs as your user, talks to the daemon over a Unix socket)  │
└─────────────────┬────────────────────────────────────────────┘
                  │ REST API over /var/run/docker.sock
┌─────────────────▼────────────────────────────────────────────┐
│                       dockerd (daemon)                       │
│  image management, networking, volumes, build orchestration  │
└─────────────────┬────────────────────────────────────────────┘
                  │ gRPC
┌─────────────────▼────────────────────────────────────────────┐
│                          containerd                          │
│          container lifecycle, image pull, snapshots          │
└─────────────────┬────────────────────────────────────────────┘
                  │ OCI runtime
┌─────────────────▼────────────────────────────────────────────┐
│                             runc                             │
│          clone()/unshare() → namespaces + cgroups            │
└─────────────────┬────────────────────────────────────────────┘
                  │
                  ▼
             Linux kernel
```
Key point: everything below containerd is shared with most other runtimes. What makes it “Docker” is the daemon + CLI + build tooling + image ecosystem.
Implication: on most servers you don’t need Docker at all — containerd alone (plus nerdctl or crictl as a CLI) runs images fine, and is what Kubernetes uses.
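To make that concrete, on a containerd-only host the familiar commands map almost one-to-one. A sketch, assuming nerdctl is installed alongside containerd:

```shell
# nerdctl speaks the docker CLI dialect against containerd directly
sudo nerdctl run -d --name web -p 8080:80 nginx:1.27
sudo nerdctl ps                       # same output shape as docker ps
sudo nerdctl images
sudo nerdctl stop web && sudo nerdctl rm web
```

There is no dockerd in this picture: nerdctl talks to containerd's gRPC socket itself.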
Images, containers, layers
- Image — an immutable, content-addressable bundle of files + metadata (entrypoint, env, cmd, exposed ports). Named `repo:tag`; identified by `sha256:<hash>`.
- Container — a running (or stopped) instance of an image, with an ephemeral writeable layer on top. A container is to an image what a process is to a binary.
- Layer — one filesystem diff. Images are stacks of layers, shared between images that share a base.
```
docker images                        # list local images
docker image inspect nginx:1.27      # metadata, layers, digest
docker history nginx:1.27            # layer-by-layer size breakdown
```
Daily CLI — the short list
Running containers
```
docker run --rm -it --name web -p 8080:80 nginx:1.27   # run interactive, remove on exit
docker run -d --name web -p 8080:80 nginx:1.27         # detached (background)
docker ps                                              # running
docker ps -a                                           # include stopped
docker logs -f web                                     # follow stdout/stderr
docker exec -it web bash                               # shell into a running container
docker stop web                                        # graceful stop (SIGTERM → SIGKILL after 10s)
docker rm web                                          # remove stopped container
docker rm -f web                                       # stop + remove
```
Flags you’ll reach for constantly
| Flag | Purpose |
|---|---|
| `-d` | Detached (daemonised) |
| `-it` | Interactive + TTY (for shells) |
| `--rm` | Delete container on exit |
| `--name` | Name it (else Docker makes up a cute one) |
| `-p host:container` | Publish port |
| `-v name_or_path:container_path` | Mount a volume / bind mount |
| `-e KEY=value` | Environment variable |
| `--env-file .env` | Env vars from file |
| `--network` | Attach to a specific network |
| `--restart unless-stopped` | Restart policy (`no`, `on-failure`, `always`, `unless-stopped`) |
| `--cpus=1.5 --memory=512m` | Resource limits |
| `--read-only --tmpfs /tmp` | Harden root fs |
| `-u 1000:1000` | Run as this UID:GID |
| `--cap-drop=ALL --cap-add=NET_BIND_SERVICE` | Drop caps, add only what’s needed |
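Combined, a typical long-running service looks something like this. Illustrative values throughout — the image name `myapp:1.0`, port, and limits are placeholders:

```shell
docker run -d --name api \
  --restart unless-stopped \
  -p 8080:8080 \
  --env-file .env \
  --cpus=1.5 --memory=512m \
  myapp:1.0
```

Ordering of flags doesn't matter, but the image name must come last — everything after it is treated as the command.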
Working with images
```
docker pull nginx:1.27                         # fetch from registry
docker push ghcr.io/acme/myapp:sha-abc123      # push to registry
docker tag myapp:dev ghcr.io/acme/myapp:1.0    # re-tag an image
docker rmi old:tag                             # remove an image
docker image prune                             # clean up dangling images
docker system prune -a --volumes               # nuclear: reclaim everything unused
```
Building images — Dockerfile
A Dockerfile is a sequence of instructions; each (mostly) creates a layer.
Instructions you’ll use
| Instruction | Purpose |
|---|---|
| `FROM` | Base image (required, must be first) |
| `ARG` | Build-time variable (not in the final image unless you `ENV` it) |
| `ENV` | Env var baked into the image + present at runtime |
| `WORKDIR` | Set cwd; auto-created if missing |
| `COPY src dst` | Copy from build context into image |
| `ADD` | Like `COPY` but also extracts tars / fetches URLs — prefer `COPY` |
| `RUN cmd` | Run a command during build; creates a layer |
| `USER uid:gid` | Set the user for subsequent instructions + runtime |
| `EXPOSE 8080` | Documentation only (doesn’t publish) |
| `VOLUME /data` | Mark a path as a volume (becomes an anonymous volume if not explicitly bound) |
| `ENTRYPOINT ["..."]` | The command that runs (exec form) |
| `CMD ["..."]` | Default args / command (overridable) |
| `HEALTHCHECK --interval=30s CMD curl -f ...` | Runtime health probe |
| `LABEL key=value` | Metadata |
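One subtlety worth pinning down: `ENTRYPOINT` is the fixed command, `CMD` supplies default arguments appended to it, and anything after the image name on `docker run` replaces `CMD`. A minimal sketch (the image tag `demo` is hypothetical):

```dockerfile
FROM alpine:3.20
ENTRYPOINT ["echo", "hello"]
CMD ["world"]

# docker build -t demo .
# docker run --rm demo            → hello world
# docker run --rm demo override   → hello override   (CMD replaced, ENTRYPOINT kept)
# Overriding ENTRYPOINT itself requires the --entrypoint flag.
```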
Multi-stage build (the pattern)
```dockerfile
# syntax=docker/dockerfile:1.7
# ── build stage ──
FROM golang:1.22 AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app ./cmd/app

# ── runtime stage ──
FROM gcr.io/distroless/static-debian12
COPY --from=build /out/app /app
USER 65532:65532
ENTRYPOINT ["/app"]
```
Result: a ~15 MB image with no shell, no package manager, no compilers — just the binary. Enormous reduction in attack surface and download size.
Layer cache — the one thing to get right
Docker caches layers keyed by instruction + inputs. To maximise cache hits:
- Copy dependency manifests first, install, then copy source.
- Put the slowest-changing steps earliest.
- Use `.dockerignore` to keep the build context small.
```dockerfile
# GOOD — deps cached unless requirements.txt changes
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .

# BAD — any code change invalidates the pip install
COPY . .
RUN pip install -r requirements.txt
```
.dockerignore
```
.git
.venv
node_modules
__pycache__
**/*.log
.env*
```
Without this, `COPY . .` sends gigabytes of junk to the daemon on every build.
BuildKit
BuildKit is the modern builder (the default since Docker 23). It adds:
- Parallel build steps (independent stages run concurrently)
- Cache mounts: `RUN --mount=type=cache,target=/root/.cache/pip pip install ...`
- Secret mounts: `RUN --mount=type=secret,id=ssh ...` (no secrets end up in layers)
- `--platform linux/amd64,linux/arm64` for multi-arch builds via `docker buildx`
Add the `# syntax=docker/dockerfile:1.7` header to opt in to the modern Dockerfile features.
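A sketch of cache and secret mounts together, assuming a pip-based app and a secret passed at build time with `docker build --secret id=pip-token,src=token.txt`; the `fetch-private-deps.sh` script is a hypothetical stand-in for whatever needs the token:

```dockerfile
# syntax=docker/dockerfile:1.7
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
# Cache mount: pip's download cache persists across builds
# without ever being written into an image layer
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
# Secret mount: the token exists only for the duration of this RUN,
# under /run/secrets/<id>, and is never stored in the image
RUN --mount=type=secret,id=pip-token \
    PIP_TOKEN=$(cat /run/secrets/pip-token) ./fetch-private-deps.sh
COPY . .
CMD ["python", "app.py"]
```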
Networking
Docker sets up networking per the model in Containers Fundamentals. The bits you adjust:
```
docker network ls                                # default: bridge, host, none
docker network create --driver bridge mynet
docker run --network mynet --name db postgres
docker run --network mynet --name app myapp      # app can reach db as hostname "db"
```
- User-defined bridge networks enable DNS-based service discovery between containers by name. The default `bridge` network doesn’t — always create a user-defined network for multi-container setups.
- `--network host` = no network namespace; the container uses the host’s stack. No isolation, no port mapping needed.
- `--network none` = no network at all.
Volumes vs bind mounts
```
# Volume (managed by Docker, lives under /var/lib/docker/volumes/)
docker volume create pgdata
docker run -v pgdata:/var/lib/postgresql/data postgres

# Bind mount (host path, you manage it)
docker run -v /srv/pgdata:/var/lib/postgresql/data postgres

# Bind your code into a dev container for live reload
docker run -v $(pwd):/app -w /app node:20 npm run dev
```
| Type | Managed by | Good for |
|---|---|---|
| Named volume | Docker | Persistent data, backups, portability |
| Bind mount | You | Dev (host source → container), specific host paths |
| tmpfs | Kernel | Ephemeral, fast, RAM-backed scratch |
Don’t put prod databases in bind mounts — volume drivers (NFS, EBS CSI) exist for a reason.
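One chore the table implies: backing up a named volume. The usual trick is a throwaway container that mounts the volume read-only and tars it into a bind mount (a sketch — volume name and paths are illustrative):

```shell
# Back up the pgdata volume to ./backup/pgdata.tar.gz
docker run --rm \
  -v pgdata:/data:ro \
  -v "$(pwd)/backup":/backup \
  alpine tar czf /backup/pgdata.tar.gz -C /data .

# Restore into a fresh volume
docker volume create pgdata-restored
docker run --rm \
  -v pgdata-restored:/data \
  -v "$(pwd)/backup":/backup \
  alpine tar xzf /backup/pgdata.tar.gz -C /data
```

Stop the container using the volume first, or you'll snapshot a database mid-write.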
Docker Compose
Running five `docker run` commands with 20 flags each is awful. Compose (v2, a Go CLI, the `docker compose` subcommand) declares a multi-container app in YAML.
```yaml
# compose.yaml
services:
  web:
    build: .
    ports: ["8080:80"]
    environment:
      DATABASE_URL: postgres://app:app@db:5432/app
    depends_on:
      db:
        condition: service_healthy
    develop:
      watch:
        - action: sync
          path: ./src
          target: /app/src
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app"]
      interval: 5s
      retries: 10
volumes:
  pgdata:
```
```
docker compose up -d             # start
docker compose ps                # status
docker compose logs -f web       # follow logs
docker compose exec db psql -U app
docker compose down              # stop + remove
docker compose down -v           # also remove volumes
```
Compose is for local dev and small single-host deploys. For multi-host, you want Kubernetes (or Swarm, but Swarm is nearly abandoned).
Registries
An image registry is an HTTP service speaking the OCI Distribution spec.
| Registry | Notes |
|---|---|
| Docker Hub | Default; free tier has pull rate limits — use authenticated pulls in CI |
| GitHub Container Registry (ghcr.io) | Free for public images, tied to GitHub auth |
| AWS ECR / Azure ACR / Google GAR | Cloud-native with IAM integration |
| Harbor | Self-hosted, with RBAC, signing, vulnerability scanning |
| Artifactory / Nexus | Self-hosted, multi-format (images + npm + maven + …) |
Authenticate with `docker login <registry>`. Tags are mutable by default — prefer immutable digests (`image@sha256:...`) in production.
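Resolving a tag to its digest and then pinning to it looks like this (`<digest>` is a placeholder for whatever the first command prints):

```shell
# Find the digest behind a locally pulled tag
docker inspect --format '{{index .RepoDigests 0}}' nginx:1.27

# Pull and run by digest — immune to the tag being re-pushed
docker pull nginx@sha256:<digest>
docker run -d nginx@sha256:<digest>
```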
Security — the default is not enough
Run-time:
- Never use `--privileged` unless you know exactly why.
- Drop caps: `--cap-drop=ALL --cap-add=NET_BIND_SERVICE`.
- Run as a non-root `USER`.
- Read-only rootfs: `--read-only --tmpfs /tmp`.
- Keep the default seccomp profile; pair it with AppArmor / SELinux.
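The list above, assembled into one invocation. `--security-opt no-new-privileges` additionally blocks setuid/file-capability privilege escalation; `myapp:1.0` is an illustrative image name:

```shell
docker run -d --name app \
  --security-opt no-new-privileges \
  --cap-drop=ALL \
  -u 1000:1000 \
  --read-only --tmpfs /tmp \
  myapp:1.0
```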
Build / supply chain:
- Pin base images by digest, not by floating tag.
- Scan images — `docker scout`, Trivy, Grype — in CI on every build.
- Sign images — Cosign + Sigstore.
- Minimise the base — distroless, Alpine, or `scratch` for statically linked binaries.
- Use multi-stage builds to drop compilers and toolchains from the final image.
Podman and friends — the daemonless alternative
Podman has a Docker-compatible CLI (`alias docker=podman` mostly works) but:
- No daemon — each `podman` invocation runs containers directly.
- Rootless by default — runs as your user, using user namespaces.
- Native systemd integration — `podman generate systemd` creates unit files for containers.
- Pods (à la K8s) — multiple containers sharing a network namespace.
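A sketch of the rootless + systemd workflow (unit name and paths are illustrative; on newer Podman releases, Quadlet files supersede `generate systemd`):

```shell
# Runs as your user — no daemon, no root
podman run -d --name web -p 8080:80 nginx:1.27

# Generate a systemd unit and enable it as a user service
podman generate systemd --new --name web > ~/.config/systemd/user/web.service
systemctl --user daemon-reload
systemctl --user enable --now web.service
```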
When to pick which:
- Docker — most documentation, widest ecosystem, best dev-on-macOS experience.
- Podman — server deployments where you want no long-running root daemon; RHEL default.
- containerd + nerdctl — minimal runtime; what K8s uses.
Docker Desktop vs Docker Engine
- Docker Desktop (Mac / Windows / Linux GUI app) — bundles a Linux VM, the daemon, K8s, BuildKit, a dashboard. Licensed: free for personal / small-biz, paid for large commercial.
- Docker Engine — the open source daemon only. Runs natively on Linux.
On Mac/Windows, containers always run in a Linux VM — the kernel they share is the VM’s, not the host OS.
Common gotchas
- Bind mount ownership on Mac/Windows. macOS Docker maps UIDs; file permissions inside the container look weird. Use named volumes where you can.
- Time skew. Long-running VMs / containers drift; sync the host’s NTP, not the container’s.
- DNS weirdness inside the container — bridge DNS is Docker’s embedded resolver (127.0.0.11). Debug with `nslookup` from inside the container.
- `localhost` inside a container = the container itself, not the host. Use `host.docker.internal` (Mac/Win/Desktop) or the bridge gateway IP on Linux.
- Zombies. A container whose PID 1 is not an init and spawns children doesn’t reap them. Use `docker run --init` to prepend `tini` as PID 1.
- Logs filling the disk. The default JSON-file log driver does no rotation unless told to. Configure `max-size` and `max-file` in `/etc/docker/daemon.json`.
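A `daemon.json` fragment for that last fix (values are illustrative; restart dockerd after editing):

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```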