7.05 Kubernetes Persistent Volumes

Abstract

Persistent Volumes (PV) are cluster-wide storage resources in Kubernetes.

They allow administrators to centrally manage storage and allow users to request storage through Persistent Volume Claims (PVCs).

Persistent Volumes are mainly used for stateful workloads such as databases, queues, and applications that must retain data.


Why Persistent Volumes Are Needed

In the previous topic, we used volumes directly inside a Pod definition.

That works for simple use cases, but it becomes difficult in large environments because:

  • Every Pod must define its own storage configuration
  • Users must know the storage backend details
  • Storage changes must be repeated across many Pod files
  • Storage is not centrally managed

Warning

Defining storage directly inside every Pod is not scalable for production Kubernetes clusters.


Persistent Volume Concept

A Persistent Volume is a storage resource created by an administrator.

Together, the PVs in a cluster form a pool of storage from which application users later request capacity using Persistent Volume Claims.

Flow:

Administrator
      ↓
Creates Persistent Volume
      ↓
Cluster Storage Pool
      ↓
User requests storage using PVC
      ↓
Pod uses claimed storage

Note

Persistent Volumes are cluster-scoped resources: they do not belong to a namespace and are not tied to one specific Pod. PVCs, by contrast, are namespaced.


PV vs Pod Volume

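| Feature          | Pod Volume                | Persistent Volume          |
|------------------|---------------------------|----------------------------|
| Defined in       | Pod spec                  | Separate Kubernetes object |
| Managed by       | Application user          | Cluster administrator      |
| Lifecycle        | Closely tied to the Pod   | Independent of any Pod     |
| Production usage | Limited                   | Recommended                |
| Claimed by PVC   | No                        | Yes                        |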

Tip

Use normal volumes for simple temporary use cases and Persistent Volumes for production storage.


Persistent Volume Access Modes

Access modes define how a volume can be mounted.

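| Access Mode   | Meaning                                                                          |
|---------------|----------------------------------------------------------------------------------|
| ReadOnlyMany  | Many nodes can mount the volume as read-only                                     |
| ReadWriteOnce | One node can mount the volume as read-write (Pods on that node can share it)     |
| ReadWriteMany | Many nodes can mount the volume as read-write                                    |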

Note

Access mode support depends on the underlying storage backend.

For example:

  • AWS EBS usually supports ReadWriteOnce
  • NFS can support ReadWriteMany
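As an illustration of the NFS case above, a PV that advertises ReadWriteMany might look like the following sketch. The PV name, server address, export path, and capacity are hypothetical placeholders, not values from this course.

```yaml
# Sketch only: an NFS-backed PV that many nodes can mount read-write.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-example            # hypothetical name
spec:
  capacity:
    storage: 5Gi                  # illustrative size
  accessModes:
    - ReadWriteMany               # only meaningful if the backend supports it
  nfs:
    server: nfs.example.internal  # hypothetical NFS server
    path: /exports/shared         # hypothetical export path
```

Listing an access mode in the manifest does not make the backend support it; the mode should reflect what the storage system can actually do.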

Persistent Volume Example using hostPath

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-vol1

spec:
  accessModes:
    - ReadWriteOnce

  capacity:
    storage: 1Gi

  hostPath:
    path: /tmp/data

Create the Persistent Volume:

kubectl create -f pv-definition.yaml

View Persistent Volumes:

kubectl get persistentvolume

Short command:

kubectl get pv

Warning

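hostPath stores data in a directory on a single node's filesystem, so a Pod that is rescheduled to another node will not find the same data. It is useful for learning and single-node testing, but it is not recommended for production multi-node clusters.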
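To connect this PV back to the flow described earlier (PV, then PVC, then Pod), here is a minimal sketch of a claim that could bind to pv-vol1 and a Pod that mounts it. The claim name, Pod name, image, and mount path are illustrative assumptions.

```yaml
# Sketch only: a claim that can bind to a matching static PV such as pv-vol1.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim             # hypothetical name
spec:
  storageClassName: ""            # empty string: bind only to PVs with no class,
                                  # and do not trigger dynamic provisioning
  accessModes:
    - ReadWriteOnce               # must be offered by the PV
  resources:
    requests:
      storage: 500Mi              # must not exceed the PV capacity (1Gi)
---
# Sketch only: a Pod that uses the claimed storage.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod               # hypothetical name
spec:
  containers:
    - name: app
      image: nginx                # illustrative image
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: example-claim  # refers to the PVC above, not to the PV
```

Note that the Pod references only the claim. It never names the PV or the storage backend, which is the separation of concerns that PVs and PVCs are meant to provide.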


Persistent Volume Example using AWS EBS

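For production cloud environments, use cloud-backed storage. The example below uses the in-tree awsElasticBlockStore volume type, which is deprecated: the in-tree plugin was removed in Kubernetes v1.27, and current clusters serve EBS volumes through the ebs.csi.aws.com CSI driver.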

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-vol1

spec:
  accessModes:
    - ReadWriteOnce

  capacity:
    storage: 1Gi

  awsElasticBlockStore:
    volumeID: <volume-id>
    fsType: ext4

Success

Cloud-backed Persistent Volumes are better suited for production workloads because data is not tied to a local node directory.
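On clusters that use the AWS EBS CSI driver, a statically provisioned EBS volume is typically described through the csi field instead. The following is a sketch under that assumption; the PV name is hypothetical and the volume ID is left as a placeholder.

```yaml
# Sketch only: a static PV backed by an existing EBS volume via the CSI driver.
# Assumes the AWS EBS CSI driver is installed in the cluster.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-ebs-csi-example        # hypothetical name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce               # EBS volumes attach to one node at a time
  csi:
    driver: ebs.csi.aws.com       # AWS EBS CSI driver name
    volumeHandle: <volume-id>     # ID of an existing EBS volume
    fsType: ext4
```

An EBS volume lives in a single Availability Zone, so in practice such a PV usually also needs a nodeAffinity rule that restricts it to nodes in that zone.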


Persistent Volume Fields

| Field                         | Purpose                                  |
|-------------------------------|------------------------------------------|
| metadata.name                 | Name of the PV                           |
| capacity.storage              | Amount of storage available              |
| accessModes                   | How the volume can be mounted            |
| hostPath / cloud volume field | Storage backend                          |
| persistentVolumeReclaimPolicy | What happens after the claim is released |

Reclaim Policy

A reclaim policy controls what happens to a PV, and to the storage behind it, after the PVC bound to it is deleted.

Common policies:

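| Policy  | Meaning                                                                                            |
|---------|----------------------------------------------------------------------------------------------------|
| Retain  | Keep the PV and its data after the claim is deleted; an administrator reclaims the volume manually |
| Delete  | Delete the PV and the underlying storage resource after the claim is deleted                       |
| Recycle | Deprecated; do not use                                                                             |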

Example:

persistentVolumeReclaimPolicy: Retain

Tip

Use Retain for important production data such as databases.
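The one-line example above belongs directly under spec. As a sketch, here is the earlier hostPath PV with the policy in place:

```yaml
# Sketch only: the hostPath PV from earlier, with an explicit reclaim policy.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-vol1
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain   # keep the data when the claim is deleted
  hostPath:
    path: /tmp/data
```

With Retain, deleting the claim leaves the PV in the Released state with its data intact, and it is not offered to new claims until an administrator cleans it up.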


Production Best Practices

Recommended

  • Use Persistent Volumes and PVCs for stateful workloads
  • Use StorageClasses for dynamic provisioning
  • Prefer CSI drivers for modern Kubernetes storage
  • Use Retain policy for critical data
  • Enable encryption where supported
  • Back up volumes regularly
  • Monitor storage capacity and performance
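Several of these recommendations (StorageClasses, CSI drivers, the Retain policy, encryption) can be combined in a single StorageClass. The sketch below assumes an AWS cluster with the EBS CSI driver installed; the class name is hypothetical, and the parameters shown are specific to that driver.

```yaml
# Sketch only: a StorageClass for dynamic provisioning of encrypted EBS volumes.
# Assumes the AWS EBS CSI driver; other backends use different provisioners
# and parameters.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-encrypted-retain          # hypothetical name
provisioner: ebs.csi.aws.com              # CSI driver that creates the volumes
parameters:
  type: gp3                               # EBS volume type
  encrypted: "true"                       # encrypt volumes at rest
reclaimPolicy: Retain                     # PVs created from this class keep their data
volumeBindingMode: WaitForFirstConsumer   # provision only once a Pod is scheduled
```

A PVC that names this class in storageClassName would then have a PV created for it on demand, so no administrator needs to pre-create static PVs.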

Do's

  • Use PVs for persistent application data
  • Use PVCs to let users request storage safely
  • Use cloud or network storage in production
  • Choose access modes based on workload needs
  • Document reclaim policies clearly

Don'ts

  • Don't use hostPath for production application data
  • Don't assume all storage supports all access modes
  • Don't delete PVs without confirming data backup
  • Don't store sensitive data without encryption
  • Don't manually manage many static PVs when dynamic provisioning is available

Danger

Poor Persistent Volume configuration can cause data loss, failed scheduling, or application downtime.


Quick Commands

Create PV:

kubectl create -f pv-definition.yaml

List PVs:

kubectl get persistentvolume

Describe PV:

kubectl describe persistentvolume pv-vol1

Delete PV:

kubectl delete persistentvolume pv-vol1

Persistent Volume Status

Common PV statuses:

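| Status    | Meaning                                                                   |
|-----------|---------------------------------------------------------------------------|
| Available | PV is free and ready to be claimed                                        |
| Bound     | PV is bound to a PVC                                                      |
| Released  | The PVC was deleted, but the PV has not yet been reclaimed and still holds its data |
| Failed    | Automatic reclamation of the PV failed                                    |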

Note

A newly created PV usually shows status Available until a matching PVC claims it.


Summary

Quote

  • Persistent Volumes provide cluster-wide storage
  • Administrators create PVs as storage resources
  • Users request storage using PVCs
  • Access modes define how volumes can be mounted
  • hostPath is for learning, not production
  • Production clusters should use CSI-backed storage and StorageClasses