7.04 Kubernetes Volumes

Abstract

Volumes provide storage for Pods in Kubernetes.

Pods are temporary by nature. If a Pod is deleted, data written to the container filesystem is lost with it unless that data was written to a volume backed by storage that outlives the Pod.

Volumes help preserve data beyond the container lifecycle.


Why Volumes Are Needed

Containers and Pods are designed to be ephemeral.

This means:

  • Pods can be created and deleted anytime
  • Containers may restart
  • Data inside the container filesystem is temporary
  • Application-generated data may be lost

Warning

Do not store important application data only inside the container filesystem.

Use a volume when data must survive container restarts, and a volume backed by storage outside the Pod when it must also survive Pod deletion.


Docker vs Kubernetes Volumes

Platform     Storage Problem                            Solution
Docker       Container data is deleted with container   Docker Volume
Kubernetes   Pod data is deleted with Pod               Kubernetes Volume

Note

Kubernetes volumes are declared at the Pod level and then mounted into the Pod's containers. Any container in the Pod can mount the same volume.


Basic Volume Flow

A Pod writes data into a mounted path.

Example:

Container path: /opt
Volume name: data-volume
Host path: /data

Flow:

Container writes file → /opt/number.out
Mounted Volume       → data-volume
Host Storage         → /data

Success

Even if the Pod is deleted, the file can remain on the configured storage backend.


Example: Pod with Volume Mount

The following Pod generates a random number and writes it to a file.

apiVersion: v1
kind: Pod
metadata:
  name: random-number-generator

spec:
  containers:
  - name: alpine
    image: alpine
    command: ["/bin/sh", "-c"]
    args:
      - shuf -i 0-100 -n 1 >> /opt/number.out;
    volumeMounts:
    - mountPath: /opt
      name: data-volume

  volumes:
  - name: data-volume
    hostPath:
      path: /data
      type: Directory

Note

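The container writes to /opt, but the data is actually stored on the host path /data of the node that runs the Pod. Because the hostPath type is Directory, /data must already exist on that node; otherwise the Pod will not start.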
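
If /data might not exist yet on the node, one option is the hostPath type DirectoryOrCreate, which asks the kubelet to create an empty directory at that path when it is missing. A minimal sketch of the same volumes entry under that assumption (only the type value differs from the Pod definition above):

```yaml
volumes:
- name: data-volume
  hostPath:
    path: /data
    # DirectoryOrCreate: if nothing exists at /data, an empty directory is
    # created with mode 0755, owned by the same user and group as the kubelet.
    type: DirectoryOrCreate
```

This is convenient for local testing, but it also means a typo in the path silently creates a new directory on the node instead of failing.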


Volume and Mount Relationship

Using a volume in a Pod takes two parts of the Pod spec:

Section        Purpose
volumes        Defines the storage backend (Pod level)
volumeMounts   Mounts the volume inside the container (container level)

Example:

volumeMounts:
- mountPath: /opt
  name: data-volume

volumes:
- name: data-volume
  hostPath:
    path: /data
    type: Directory

Tip

The name under volumeMounts must match the name under volumes.
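
Because a volume is declared once at the Pod level, more than one container can mount it by name, each at its own path. The sketch below is illustrative only: the Pod name, the container names writer and reader, and the mount path /input are assumptions and do not appear in the original example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-demo          # hypothetical Pod name
spec:
  containers:
  - name: writer                    # hypothetical container: appends numbers
    image: alpine
    command: ["/bin/sh", "-c"]
    args:
      - while true; do shuf -i 0-100 -n 1 >> /opt/number.out; sleep 5; done
    volumeMounts:
    - mountPath: /opt
      name: data-volume             # must match the name under volumes
  - name: reader                    # hypothetical container: prints last number
    image: alpine
    command: ["/bin/sh", "-c"]
    args:
      - while true; do tail -n 1 /input/number.out; sleep 5; done
    volumeMounts:
    - mountPath: /input             # same volume, different path
      name: data-volume
      readOnly: true                # this container cannot modify the data
  volumes:
  - name: data-volume
    hostPath:
      path: /data
      type: Directory
```

Both containers see the same files: what the writer stores under /opt appears to the reader under /input.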


HostPath Volume

hostPath mounts a directory from the Kubernetes node into the Pod.

Example:

volumes:
- name: data-volume
  hostPath:
    path: /data
    type: Directory

Use cases:

  • local testing
  • single-node clusters
  • node-level logs
  • system agents

Warning

hostPath is not recommended for normal application data in multi-node production clusters.


Why HostPath Is Risky in Multi-Node Clusters

In a multi-node cluster:

Node 1 → /data
Node 2 → /data
Node 3 → /data

These paths are not automatically the same storage.

If a Pod moves to another node, it may not find the same data.

Danger

Using hostPath for application data can cause data inconsistency or data loss in production.
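
When hostPath cannot be avoided, one common mitigation is to pin the Pod to the node that holds the data. A minimal sketch using the well-known kubernetes.io/hostname node label; the value node-1 is a placeholder for whatever name kubectl get nodes reports in your cluster.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: random-number-generator
spec:
  # Schedule only onto the node whose hostname label matches.
  nodeSelector:
    kubernetes.io/hostname: node-1   # placeholder node name (assumption)
  containers:
  - name: alpine
    image: alpine
    command: ["/bin/sh", "-c"]
    args:
      - shuf -i 0-100 -n 1 >> /opt/number.out;
    volumeMounts:
    - mountPath: /opt
      name: data-volume
  volumes:
  - name: data-volume
    hostPath:
      path: /data
      type: Directory
```

Pinning keeps the Pod next to its data, but it also means the Pod cannot be rescheduled elsewhere, and the data is still lost if that node fails.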


Volume Types

Kubernetes supports many volume backends.

Examples:

  • hostPath
  • emptyDir
  • NFS
  • GlusterFS
  • CephFS
  • Flocker
  • Fibre Channel
  • AWS EBS
  • Azure Disk
  • Azure File
  • Google Persistent Disk

Note

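In modern Kubernetes, production storage is usually managed using PersistentVolumes, PersistentVolumeClaims, StorageClasses, and CSI drivers. Several of the in-tree types listed above (for example Flocker and GlusterFS) have been removed from recent Kubernetes releases, and the cloud-disk types (AWS EBS, Azure Disk, Google Persistent Disk) are now served by their CSI drivers, so check the documentation for your cluster version.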
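
For contrast with hostPath, emptyDir is the simplest type in the list: Kubernetes creates an empty directory when the Pod is assigned to a node and deletes it when the Pod is removed, so the data survives container restarts but not Pod deletion. A minimal sketch; the Pod name, the volume name scratch, and the mount path /cache are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-demo               # hypothetical Pod name
spec:
  containers:
  - name: alpine
    image: alpine
    command: ["/bin/sh", "-c"]
    args:
      # Write one number, then stay alive so the file can be inspected.
      - shuf -i 0-100 -n 1 >> /cache/number.out; sleep 3600
    volumeMounts:
    - mountPath: /cache
      name: scratch
  volumes:
  - name: scratch                   # hypothetical volume name
    emptyDir: {}                    # removed together with the Pod
```

This makes emptyDir suitable for scratch space and caches, not for data that must outlive the Pod.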


AWS EBS Volume Example

Instead of using hostPath, you can use a cloud disk such as AWS EBS.

volumes:
- name: data-volume
  awsElasticBlockStore:
    volumeID: <volume-id>
    fsType: ext4

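This stores data on an AWS EBS volume instead of the node’s local filesystem. The EBS volume must already exist and be in the same availability zone as the node running the Pod. Note that the in-tree awsElasticBlockStore type is deprecated; on current clusters, EBS volumes are normally provisioned through the AWS EBS CSI driver and consumed via a PersistentVolumeClaim.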

Success

Cloud-backed storage is better suited for production workloads than local node storage.
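
On clusters that run the AWS EBS CSI driver, EBS-backed storage is typically requested through a StorageClass rather than declared inline in the Pod. The following is a sketch under that assumption; the class name ebs-gp3 is hypothetical, and the driver must already be installed in the cluster.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3                     # hypothetical class name
provisioner: ebs.csi.aws.com        # AWS EBS CSI driver (assumed installed)
parameters:
  type: gp3                         # EBS volume type
# Delay provisioning until a Pod is scheduled, so the volume is created
# in the same availability zone as the node that will use it.
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete               # delete the EBS volume when the claim is deleted
```

A PersistentVolumeClaim that names this class in storageClassName would then trigger dynamic provisioning of an EBS volume.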


Production Best Practices

Recommended

  • Use PersistentVolumes (PV) and PersistentVolumeClaims (PVC) for production
  • Use StorageClasses for dynamic provisioning
  • Prefer CSI-based storage drivers
  • Avoid hostPath for application data
  • Use cloud or network storage for stateful workloads
  • Back up critical volumes regularly
  • Use encryption for sensitive data

Do's

  • Use volumes for data that must survive container restarts
  • Use PVCs for production workloads
  • Use hostPath only for special cases
  • Use cloud storage for databases and stateful apps
  • Monitor disk usage and capacity

Don'ts

  • Don't store critical data only inside containers
  • Don't use hostPath for multi-node production apps
  • Don't assume /data is shared across nodes
  • Don't skip backup and recovery planning
  • Don't ignore storage performance limits

Danger

Stateful applications such as databases need carefully designed storage. Poor volume choices can lead to downtime or data loss.


Quick Commands

Create Pod:

kubectl apply -f pod.yaml

Check Pod:

kubectl get pods

Describe Pod:

kubectl describe pod random-number-generator

Check file on the node where the Pod was scheduled:

ls /data


Volumes vs Persistent Volumes

Feature                Volume                   Persistent Volume
Defined in             Pod spec                 Cluster resource
Lifecycle              Tied to Pod definition   Independent of Pod
Production usage       Limited                  Recommended
Dynamic provisioning   No                       Yes, with StorageClass
Best for               Simple use cases         Stateful workloads

Tip

Kubernetes Volumes are a starting point. For production storage, move to Persistent Volumes and Persistent Volume Claims.
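
As a rough illustration of that move, the random-number Pod can request storage through a PersistentVolumeClaim instead of naming a host path. This is a sketch, not a production configuration: the claim name data-claim and the 1Gi size are assumptions, and it relies on the cluster having a default StorageClass that can provision the volume.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim                  # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce                 # mounted read-write by a single node
  resources:
    requests:
      storage: 1Gi                  # illustrative size
  # storageClassName is omitted, so the cluster's default StorageClass is used.
---
apiVersion: v1
kind: Pod
metadata:
  name: random-number-generator
spec:
  containers:
  - name: alpine
    image: alpine
    command: ["/bin/sh", "-c"]
    args:
      - shuf -i 0-100 -n 1 >> /opt/number.out;
    volumeMounts:
    - mountPath: /opt
      name: data-volume
  volumes:
  - name: data-volume
    persistentVolumeClaim:
      claimName: data-claim         # must match the PVC name above
```

The Pod spec no longer says where the data lives. Deleting the Pod leaves the claim and its data in place, and a replacement Pod that references data-claim mounts the same storage.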


Summary

  • Pods are ephemeral and can lose local data
  • Volumes provide storage for Pods
  • volumeMounts mount volumes into containers
  • hostPath stores data on a specific node
  • hostPath is not recommended for multi-node production applications
  • Production workloads should use PVs, PVCs, StorageClasses, and CSI drivers