1.09 Docker Storage Drivers & File Systems
Overview
Docker uses a layered file system to store images and container data efficiently. Storage drivers manage this layered architecture, enabling copy-on-write, layer sharing, and writable container layers.
Abstract
When Docker is installed, it creates a directory structure under /var/lib/docker to store all image layers, container data, and volumes. Images are built in read-only layers that are shared across containers. Each running container gets a thin writable layer on top. Storage drivers (AUFS, Overlay2, Device Mapper, etc.) handle the mechanics of this layering and so determine its performance, compatibility, and stability characteristics.
Why It Matters in Production
- Disk efficiency: shared image layers mean dozens of containers from the same image consume very little extra disk space
- Build speed: cached layers prevent redundant work on rebuilds; only the first changed layer and the layers after it are rebuilt
- Data persistence: container layers are ephemeral; volumes are required to persist data beyond a container's lifetime
- Storage driver choice: the wrong driver for your OS or workload can cause instability or poor I/O performance
Key Concepts
| Concept | Description |
|---|---|
| Image layers | Read-only layers created during docker build; shared across all containers using that image |
| Container layer | A thin read-write layer added on top of image layers when a container starts; destroyed with the container |
| Copy-on-Write (CoW) | When a container modifies a file from an image layer, Docker copies it to the container layer first; the original image layer is never changed |
| Volume mount | Mounts a Docker-managed volume from /var/lib/docker/volumes/ into a container |
| Bind mount | Mounts any arbitrary directory from the host filesystem into a container |
| Storage driver | The Docker Engine component that manages layer creation, CoW, and the union filesystem, typically on top of a kernel filesystem feature such as OverlayFS |
Docker File System Layout
Docker stores all data under /var/lib/docker:
/var/lib/docker
├── aufs/        # or overlay2/, devicemapper/ (storage driver data, including container writable layers)
├── containers/  # per-container metadata, configuration, and logs
├── image/       # image metadata and layer references
└── volumes/     # named volumes created via docker volume create
Layered Architecture
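Each filesystem-changing instruction in a Dockerfile (RUN, COPY, ADD) creates a new read-only layer containing only the delta from the previous layer. Metadata-only instructions such as ENTRYPOINT add no file content, which is why the last entry below is shown as roughly 0 B: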
# Layer 1: base Ubuntu image (~120 MB); image names must be lowercase
FROM ubuntu:22.04
# Layer 2: apt packages (~306 MB)
RUN apt-get update && apt-get -y install python3 python3-pip
# Layer 3: pip packages (~6.3 MB)
RUN pip3 install flask flask-mysql
# Layer 4: source code (~229 B)
COPY . /opt/source-code
# Layer 5: entrypoint (metadata only, ~0 B)
ENTRYPOINT FLASK_APP=/opt/source-code/app.py flask run
When a second application shares the same base layers, Docker reuses cached layers and only builds the differing ones, saving time and disk space:
Dockerfile 1                         Dockerfile 2
─────────────────────────────        ───────────────────────────────────────
Layer 1. Base Ubuntu   120 MB  ───►  Layer 1. Base Ubuntu   0 MB  (cached)
Layer 2. apt packages  306 MB  ───►  Layer 2. apt packages  0 MB  (cached)
Layer 3. pip packages  6.3 MB  ───►  Layer 3. pip packages  0 MB  (cached)
Layer 4. Source code   229 B         Layer 4. Source code   229 B (different)
Layer 5. Entrypoint    0 B           Layer 5. Entrypoint    0 B   (different)
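The reuse pattern in the comparison above can be illustrated with a toy model in plain shell. This is a sketch under a simplifying assumption: each layer's cache key is derived only from its parent's key and the instruction text, whereas real builds also checksum the files referenced by COPY and ADD. The function name layer_keys and the app1.py / app2.py file names are hypothetical.

```shell
# Simplified model of build-cache keys: each layer's key is derived from its
# parent's key plus the instruction text. Real Docker builds also checksum the
# files used by COPY/ADD; this sketch ignores that.
layer_keys() {
  parent=""
  for instruction in "$@"; do
    parent=$(printf '%s|%s' "$parent" "$instruction" | cksum | cut -d' ' -f1)
    printf '%s  %s\n' "$parent" "$instruction"
  done
}

echo "Dockerfile 1:"
layer_keys "FROM ubuntu" "RUN apt-get -y install python" "RUN pip install flask" "COPY app1.py /opt/source-code"

echo "Dockerfile 2:"
layer_keys "FROM ubuntu" "RUN apt-get -y install python" "RUN pip install flask" "COPY app2.py /opt/source-code"
```

The first three keys should match between the two listings (a cache hit in this model), while the COPY line gets a different key. Because every key depends on its parent, changing an early instruction changes every key after it, which is why frequently changing steps belong near the end of a Dockerfile.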
Container Layer & Copy-on-Write
When a container starts, Docker adds a writable Layer 6 (Container Layer) on top of the read-only image layers:
┌──────────────────────────────┐
│ Layer 6: Container Layer     │  ← Read/Write (lives only as long as the container)
│   app.py (modified copy)     │
│   temp.txt (new file)        │
└──────────────────────────────┘
┌──────────────────────────────┐
│ Layer 5: Update Entrypoint   │
│ Layer 4: Source code         │  ← Read Only (image layers, never modified)
│ Layer 3: pip packages        │
│ Layer 2: apt packages        │
│ Layer 1: Base Ubuntu         │
└──────────────────────────────┘
Copy-on-Write in action:
- Container reads app.py from the image layer (read-only)
- Container attempts to modify app.py
- Docker copies app.py into the container layer
- All subsequent modifications happen on the copy in the container layer
- The original image layer is never touched, so other containers using the same image are unaffected
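The steps above can be mimicked with two ordinary directories. This is only a toy model, not a real union mount: the directory names image_layer and container_layer and the helpers read_file and write_file are made up for illustration, and an actual storage driver does this work inside the kernel filesystem.

```shell
# Toy model of copy-on-write using two plain directories (not a real union mount).
work=$(mktemp -d)
mkdir -p "$work/image_layer" "$work/container_layer"
echo 'print("v1")' > "$work/image_layer/app.py"

# Read: prefer the container layer, fall back to the read-only image layer.
read_file() {
  if [ -f "$work/container_layer/$1" ]; then
    cat "$work/container_layer/$1"
  else
    cat "$work/image_layer/$1"
  fi
}

# Write: copy the file up on first modification, then change only the copy.
write_file() {
  if [ ! -f "$work/container_layer/$1" ] && [ -f "$work/image_layer/$1" ]; then
    cp "$work/image_layer/$1" "$work/container_layer/$1"
  fi
  echo "$2" >> "$work/container_layer/$1"
}

read_file app.py                        # served from the image layer
write_file app.py 'print("patched")'    # triggers the copy-up
read_file app.py                        # now served from the container layer
cat "$work/image_layer/app.py"          # the original is unchanged
```

Deleting the container_layer directory discards the patched copy and leaves the image file intact, which mirrors what happens when a container is removed.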
Volumes & Persistent Storage
Container layers are destroyed when the container is removed. Use volumes to persist data.
Volume Mount (Docker-managed)
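A minimal sketch of a volume mount, assuming a running Docker daemon. The volume name data_volume comes from this section; the alpine image, the /data target, the greeting.txt file, and the wrapper function volume_demo are illustrative choices. The commands are wrapped in a function so nothing runs until it is called.

```shell
# Sketch: persist a file in a named volume across two throwaway containers.
volume_demo() {
  # Create a Docker-managed volume.
  docker volume create data_volume
  # The first container writes into the volume and is then removed (--rm).
  docker run --rm -v data_volume:/data alpine sh -c 'echo hello > /data/greeting.txt'
  # A second, brand-new container still sees the file.
  docker run --rm -v data_volume:/data alpine cat /data/greeting.txt
}
```

Calling volume_demo should show that the file written by the first container outlives it, because the data sits in the volume rather than in either container layer.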
A volume named data_volume is stored on the host under /var/lib/docker/volumes/data_volume/, with its contents in the _data subdirectory.
Bind Mount (Host path)
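A bind mount uses the same -v flag but with an absolute host path in place of a volume name. The sketch below assumes a running Docker daemon; the host directory $HOME/bind-demo, the alpine image, and the function name bind_demo are hypothetical.

```shell
# Sketch: a container writes a file straight into a host directory.
bind_demo() {
  # Hypothetical host directory; it must be an absolute path, otherwise
  # -v treats the value as a volume name.
  host_dir="${1:-$HOME/bind-demo}"
  mkdir -p "$host_dir"
  docker run --rm -v "$host_dir":/data alpine sh -c 'echo hello > /data/greeting.txt'
  # The file is immediately visible on the host.
  cat "$host_dir/greeting.txt"
}
```

Unlike a named volume, the data lives wherever the host path points, so it is only as portable as that directory.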
Preferred Syntax: --mount
The --mount flag is explicit and preferred over -v:
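As a sketch of the two forms side by side, assuming a running Docker daemon: data_volume is the volume named in this section, while the alpine image, the /data target, the $HOME/bind-demo directory, and the function name mount_demo are illustrative.

```shell
# Sketch: the --mount equivalents of the -v shorthand.
mount_demo() {
  # Hypothetical host directory; with --mount it must already exist.
  host_dir="${1:-$HOME/bind-demo}"
  # Named volume, equivalent to: -v data_volume:/data
  docker run --rm --mount type=volume,source=data_volume,target=/data alpine ls /data
  # Read-only bind mount, equivalent to: -v "$host_dir":/data:ro
  docker run --rm --mount type=bind,source="$host_dir",target=/data,readonly alpine ls /data
}
```

One behavioural difference worth knowing: -v creates a missing host directory for a bind mount, whereas --mount reports an error, which tends to surface typos in paths earlier.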
| Mount Type | Source | Use Case |
|---|---|---|
| volume | /var/lib/docker/volumes/ | Portable, Docker-managed persistence |
| bind | Any host path | Dev workflows, existing data on host |
Storage Drivers
Storage drivers implement the union filesystem that makes layering possible. Unless one is configured explicitly, Docker selects a default driver based on what the host's kernel and backing filesystem support.
| Driver | Notes |
|---|---|
| overlay2 | Preferred driver on modern Linux (Ubuntu, Debian, CentOS 8+) |
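| aufs | Legacy default on older Ubuntu; needs a kernel with AUFS support and has been deprecated and removed from recent Docker Engine releases |
| devicemapper | Used on older CentOS/RHEL where overlay2 is unavailable; likewise deprecated in recent releases |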
| btrfs | For hosts using the BTRFS filesystem |
| zfs | For hosts using ZFS |
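To confirm which driver a host actually ended up with, a small check along these lines may help. It is a sketch that assumes a running Docker daemon; the function name check_storage_driver and the default expectation of overlay2 are illustrative, not a required convention.

```shell
# Sketch: compare the active storage driver with the one you expect.
check_storage_driver() {
  expected="${1:-overlay2}"
  # .Driver is the storage driver field reported by docker info.
  actual=$(docker info --format '{{.Driver}}')
  if [ "$actual" = "$expected" ]; then
    echo "OK: storage driver is $actual"
  else
    echo "WARNING: storage driver is $actual, expected $expected" >&2
    return 1
  fi
}
```

If the check fails, the driver can be pinned with the "storage-driver" key in /etc/docker/daemon.json followed by a daemon restart. Images and containers created under the previous driver are not visible under the new one, so test this on a non-production host first.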
Best Practices
- Use overlay2 on modern Linux hosts: it is the most performant and widely supported driver.
- Order Dockerfile instructions from least to most frequently changing to maximise layer cache hits (base OS → dependencies → source code).
- Use named volumes for all stateful containers (databases, file uploads); never rely on the container layer for persistence.
- Use --mount instead of -v in scripts, and the equivalent long volume syntax in Compose files, for clarity and explicit intent.
- Keep image layers small: consolidate related RUN commands to avoid unnecessary intermediate layers.
Security Best Practices
- Never store secrets in image layers: they are readable by anyone with access to the image, even if deleted in a later layer (use build secrets or inject at runtime).
- Bind mounts expose the host filesystem: only bind-mount directories the container actually needs; avoid mounting / or /etc.
- Use read-only bind mounts where the container only needs to read host data: --mount type=bind,source=/config,target=/config,readonly.
- Restrict volume access: in multi-tenant environments, ensure containers cannot access each other's volumes via misconfigured mounts.
- Scan image layers for CVEs: vulnerabilities in base layers affect all containers built from that image.
Do and Don't
| ✅ Do | ❌ Don't |
|---|---|
| Use named volumes for database containers | Store persistent data in the container layer |
| Order Dockerfile layers by change frequency | Put frequently changing files (source code) in early layers |
| Use --mount for explicit, readable mount definitions | Use -v shorthand in production scripts |
| Use overlay2 on supported Linux hosts | Override the storage driver without testing stability |
| Rebuild images to modify read-only layers | Attempt to edit image layer files directly on disk |
Common Mistakes
- Assuming container data persists: the container layer is deleted with the container; always use volumes for anything that must survive.
- Modifying files in image layers: changes made inside a running container go to the container layer, not the image. To change the image itself, rebuild it with docker build.
- Bind mounting broad host paths: mounting /var or /home into containers is a security and stability risk.
- Ignoring layer cache invalidation: placing COPY . /app early in a Dockerfile forces all subsequent layers to rebuild on every code change.
- Not specifying volume mounts for databases: MySQL/Postgres data in a container layer is lost without warning when the container is removed.
Troubleshooting
# Inspect storage driver and root directory
docker info | grep -E "Storage Driver|Docker Root Dir"
# List all volumes
docker volume ls
# Inspect a volume (find its mount path on host)
docker volume inspect data_volume
# Check disk usage by images, containers, and volumes
docker system df
# Remove unused volumes (reclaim disk space)
docker volume prune
# Inspect a container's mounts
docker inspect <container_name> | grep -A 20 '"Mounts"'
# List an image's layers with the instruction and size of each
docker history <image_name>
Quick Recap
- Docker stores all data under /var/lib/docker: images, containers, and volumes each in their own subdirectory
- Images are built as stacked read-only layers; shared layers are cached and reused across images and containers
- Running a container adds a thin read-write container layer on top; this layer is destroyed when the container is removed, not when it merely stops
- Copy-on-Write (CoW) allows containers to "modify" image files by copying them into the container layer first
- Volumes or bind mounts are required to persist data beyond a container's lifetime
- --mount is the preferred, explicit syntax over -v
- Storage drivers (overlay2, aufs, devicemapper, etc.) implement the layering; Docker picks a default based on what the host supports
Interview / Revision Notes
- Where does Docker store its data? /var/lib/docker: images in image/, container data in containers/, volumes in volumes/
- What are image layers? Read-only layers created per Dockerfile instruction; shared between containers using the same image
- What is the container layer? A thin read-write layer added at docker run; destroyed when the container is removed
- What is Copy-on-Write? When a container modifies a read-only image file, Docker copies it to the container layer first; the original is never changed
- Difference between volume mount and bind mount? Volume mount uses Docker-managed storage under /var/lib/docker/volumes/; bind mount maps any host path
- Why use --mount over -v? More verbose and explicit, so easier to read and less error-prone in scripts
- Default storage driver on Ubuntu? overlay2 on modern kernels; aufs on older systems
- How does layer caching speed up builds? Unchanged layers are reused from cache; only the first changed layer and the layers after it are rebuilt
- What happens to container data when a container is deleted? Data in the container layer is permanently lost; use volumes to persist data