9.3 Configure High Availability in Kubernetes

Abstract

High Availability (HA) in Kubernetes removes single points of failure from the cluster control plane.

In production, HA is required because losing a single control plane node should not stop cluster management, scheduling, controller reconciliation, or API access.

Why High Availability Is Needed

In a cluster with only one control plane node, if that node fails:

existing pods may continue running on worker nodes
users may still access running applications
kubectl access fails because the API server is unavailable
failed pods may not be recreated
new pods may not be scheduled
controllers cannot reconcile desired state
cluster operations become unavailable

Warning

Running applications may survive a control plane failure temporarily, but the cluster cannot properly heal, scale, schedule, or accept management requests until the control plane is restored.

What HA Protects

A production HA Kubernetes design should provide redundancy for:

Area	Why It Matters
Control plane nodes	Avoid losing cluster management
API servers	Keep `kubectl` and API access available
Controller managers	Keep reconciliation running
Schedulers	Keep pod scheduling available
etcd	Protect cluster state
Worker nodes	Keep workloads available
Load balancer	Provide a stable API endpoint

Success

A highly available cluster avoids a single point of failure across both control plane and worker components.

Control Plane Components in HA

A control plane node commonly runs:

kube-apiserver
kube-controller-manager
kube-scheduler
etcd

In an HA setup, these components run on multiple control plane nodes.

Control Plane Node 1
  ├── kube-apiserver
  ├── kube-controller-manager
  ├── kube-scheduler
  └── etcd

Control Plane Node 2
  ├── kube-apiserver
  ├── kube-controller-manager
  ├── kube-scheduler
  └── etcd

API Server High Availability

The kube-apiserver can run in active-active mode.

That means multiple API servers can be running at the same time.

kubectl / API clients
        ↓
Load Balancer :6443
        ↓
 ┌───────────────┬───────────────┐
 │ API Server 1  │ API Server 2  │
 └───────────────┴───────────────┘

Note

API servers are stateless and keep all cluster state in etcd, so multiple instances can safely serve requests at the same time.

API Server Load Balancer

With multiple control plane nodes, clients should not connect directly to a single control plane node, because that node then becomes a single point of failure for API access.

Instead, configure a load balancer in front of API servers.

Example endpoint:

https://load-balancer:6443

Your kubeconfig should point to the load balancer:

clusters:
- name: production
  cluster:
    server: https://load-balancer:6443
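One way to audit this is to inspect the output of `kubectl config view -o json` and flag any cluster whose server URL is not the load balancer endpoint. The following is a minimal sketch under that assumption; the function name, the `staging` cluster, and its address are hypothetical and only serve the illustration.

```python
import json


def clusters_not_using_endpoint(kubeconfig_json: str, expected_server: str) -> list[str]:
    """Return the names of clusters whose server URL differs from expected_server."""
    config = json.loads(kubeconfig_json)
    mismatched = []
    for entry in config.get("clusters") or []:
        server = entry.get("cluster", {}).get("server", "")
        # Ignore a trailing slash so equivalent URLs compare equal.
        if server.rstrip("/") != expected_server.rstrip("/"):
            mismatched.append(entry.get("name", "<unnamed>"))
    return mismatched


# Hypothetical sample shaped like the JSON form of a kubeconfig.
sample = json.dumps({
    "clusters": [
        {"name": "production", "cluster": {"server": "https://load-balancer:6443"}},
        {"name": "staging", "cluster": {"server": "https://10.240.0.10:6443"}},
    ]
})

print(clusters_not_using_endpoint(sample, "https://load-balancer:6443"))
# prints ['staging']
```

Any cluster name in the result points at an individual node rather than the shared endpoint.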

Tip

Use a highly available load balancer such as HAProxy, NGINX, a cloud load balancer, or a virtual IP solution.

Controller Manager and Scheduler HA

The kube-controller-manager and kube-scheduler must not actively perform the same work at the same time.

They run in active-standby mode using leader election.

Component	HA Mode	Reason
kube-apiserver	Active-active	Can process independent requests
kube-controller-manager	Active-standby	Prevent duplicate reconciliation
kube-scheduler	Active-standby	Prevent duplicate scheduling
etcd	Distributed quorum	Protect cluster state

Warning

If multiple schedulers or controller managers act as leaders at the same time, duplicate or conflicting actions may occur.

Leader Election

Leader election ensures that only one instance of the controller manager, and only one instance of the scheduler, is active at any time.

Typical flow:

All instances start
Each tries to acquire a lease
One becomes the leader
Others remain standby
If the leader fails, another instance takes over
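The flow above can be sketched as a small in-memory simulation. This is an illustration only: the real components coordinate through a Lease object stored via the API server, while the `Lease` class, the `try_acquire_or_renew` function, and the instance names `cm-1` and `cm-2` below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    """In-memory stand-in for a leader election lease."""
    holder: Optional[str] = None
    renew_time: float = 0.0
    duration: float = 15.0  # seconds, matching the 15s lease duration above

    def expired(self, now: float) -> bool:
        return self.holder is None or now - self.renew_time > self.duration


def try_acquire_or_renew(lease: Lease, candidate: str, now: float) -> bool:
    """Return True if candidate is the leader after this attempt."""
    if lease.holder == candidate or lease.expired(now):
        lease.holder = candidate
        lease.renew_time = now
        return True
    return False


lease = Lease()
print(try_acquire_or_renew(lease, "cm-1", now=0.0))   # True: the lease was free
print(try_acquire_or_renew(lease, "cm-2", now=0.0))   # False: cm-1 holds the lease
try_acquire_or_renew(lease, "cm-1", now=10.0)         # cm-1 renews, then stops renewing
print(try_acquire_or_renew(lease, "cm-2", now=20.0))  # False: lease not yet expired
print(try_acquire_or_renew(lease, "cm-2", now=26.0))  # True: 16s since the last renewal
print(lease.holder)                                   # cm-2
```

The standby instance takes over only after the lease has gone unrenewed for longer than its duration, which is why failover is brief but not instant.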

Example controller manager options:

kube-controller-manager \
  --leader-elect=true \
  --leader-elect-lease-duration=15s \
  --leader-elect-renew-deadline=10s \
  --leader-elect-retry-period=2s

Note

--leader-elect defaults to true for kube-controller-manager and kube-scheduler, so it rarely needs to be set explicitly.

Leader Election Timing

Option	Purpose	Common Default
`--leader-elect`	Enables leader election	`true`
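`--leader-elect-lease-duration`	How long standby instances wait after the last observed renewal before trying to take over	`15s`
`--leader-elect-renew-deadline`	How long the active leader keeps trying to renew before it gives up leadership	`10s`
`--leader-elect-retry-period`	How long instances wait between attempts to acquire or renew the lease	`2s`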

Tip

Do not tune leader election values casually in production. Incorrect values can cause unnecessary failovers or slow recovery.
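If these values are changed, their relative ordering still matters. The sketch below checks a proposed set of timings before it is rolled out. The two rules, and the 1.2 jitter factor in particular, reflect how the client-go leader election library is understood to validate its configuration; treat them as assumptions and verify against the Kubernetes version in use. The function name is hypothetical.

```python
def validate_leader_election_timing(
    lease_duration: float,
    renew_deadline: float,
    retry_period: float,
    jitter_factor: float = 1.2,  # assumed value, verify for your version
) -> list[str]:
    """Return a list of problems; an empty list means the ordering looks sane."""
    problems = []
    if lease_duration <= renew_deadline:
        problems.append("lease duration must be greater than renew deadline")
    if renew_deadline <= jitter_factor * retry_period:
        problems.append("renew deadline must be greater than retry period * jitter factor")
    return problems


print(validate_leader_election_timing(15, 10, 2))  # defaults from the table: []
print(validate_leader_election_timing(10, 10, 2))  # lease duration too short
```

All durations are in seconds. The defaults in the table pass both checks.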

etcd in HA

etcd stores all Kubernetes cluster state.

Examples of data stored in etcd:

nodes
pods
deployments
services
secrets
configmaps
RBAC objects
cluster configuration

Danger

If etcd data is lost and no backup exists, the Kubernetes cluster state may be unrecoverable.

etcd Access from API Server

The kube-apiserver is the only Kubernetes control plane component that directly communicates with etcd.

Example API server configuration:

kube-apiserver \
  --etcd-servers=https://10.240.0.10:2379,https://10.240.0.11:2379,https://10.240.0.12:2379 \
  --etcd-cafile=/var/lib/kubernetes/ca.pem \
  --etcd-certfile=/var/lib/kubernetes/apiserver-etcd-client.crt \
  --etcd-keyfile=/var/lib/kubernetes/apiserver-etcd-client.key

Note

The API server can connect to any healthy etcd member from the configured list.

HA Control Plane Topologies

There are two common HA control plane topologies:

Stacked etcd topology
External etcd topology

Stacked etcd Topology

In a stacked topology, each control plane node also runs an etcd member.

Control Plane Node 1
  ├── kube-apiserver
  ├── kube-controller-manager
  ├── kube-scheduler
  └── etcd

Control Plane Node 2
  ├── kube-apiserver
  ├── kube-controller-manager
  ├── kube-scheduler
  └── etcd

Advantages

easier to set up
easier to manage
fewer servers required
common for smaller HA clusters

Disadvantages

control plane and etcd fail together on the same node
losing a node reduces both API server capacity and etcd fault tolerance
higher risk during failures

Warning

Stacked topology is simpler, but a failed node removes both a control plane instance and an etcd member.

External etcd Topology

In an external etcd topology, etcd runs on separate dedicated nodes.

Control Plane Nodes
  ├── kube-apiserver
  ├── kube-controller-manager
  └── kube-scheduler

External etcd Nodes
  └── etcd cluster

Advantages

less risky for etcd availability
control plane failure does not directly remove etcd members
better separation of responsibilities
preferred for stronger production isolation

Disadvantages

harder to set up
requires more servers
more certificates and networking configuration
more operational complexity

Success

External etcd topology is safer for critical production clusters because etcd is isolated from control plane node failures.

Stacked vs External etcd

Area	Stacked etcd	External etcd
Setup complexity	Lower	Higher
Server count	Fewer	More
Management effort	Easier	Harder
Failure isolation	Lower	Higher
Production safety	Medium	Higher
Best for	Small/medium HA clusters	Critical production clusters
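The failure isolation difference can be made concrete with a short calculation. This is a simplified sketch: it counts nodes only, ignores network partitions, and uses hypothetical function names.

```python
def etcd_quorum(members: int) -> int:
    """Majority of members that etcd needs in order to accept writes."""
    return members // 2 + 1


def stacked(nodes: int, failed: int) -> dict:
    """Each node runs an API server and an etcd member, so both shrink together."""
    alive = nodes - failed
    return {
        "servers": nodes,
        "api_servers": alive,
        "etcd_members": alive,
        "etcd_quorum_ok": alive >= etcd_quorum(nodes),
    }


def external(cp_nodes: int, etcd_nodes: int, failed_cp: int) -> dict:
    """Control plane node failures leave the dedicated etcd nodes untouched."""
    return {
        "servers": cp_nodes + etcd_nodes,
        "api_servers": cp_nodes - failed_cp,
        "etcd_members": etcd_nodes,
        "etcd_quorum_ok": etcd_nodes >= etcd_quorum(etcd_nodes),
    }


# Two control plane nodes fail in each design.
print(stacked(nodes=3, failed=2))
print(external(cp_nodes=3, etcd_nodes=3, failed_cp=2))
```

With two control plane nodes down, the stacked design loses etcd quorum, while the external design keeps quorum at the cost of six servers instead of three.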

Recommended Production Design

A simple production HA design includes:

Users / kubectl / API clients
        ↓
Control Plane Load Balancer
        ↓
 ┌────────────────┬────────────────┐
 │ master-01      │ master-02      │
 │ API Server     │ API Server     │
 │ Controller Mgr │ Controller Mgr │
 │ Scheduler      │ Scheduler      │
 │ etcd           │ etcd           │
 └────────────────┴────────────────┘
        ↓
 ┌────────────────┬────────────────┐
 │ worker-01      │ worker-02      │
 └────────────────┴────────────────┘

Note

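The diagram shows two control plane nodes for brevity. Because a two-member etcd cluster cannot tolerate the loss of either member, production clusters should use three or more control plane nodes and an odd number of etcd members.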

Minimum Production HA Considerations

Component	Recommendation
Control plane nodes	At least 3 for stronger HA
etcd members	Odd number, commonly 3 or 5
API endpoint	Load balancer or virtual IP
Worker nodes	Multiple workers across failure zones
etcd backup	Scheduled and tested
Certificates	Properly managed and rotated
Monitoring	Required for API server, etcd, nodes, and controllers

Danger

A two-member etcd cluster needs both members for quorum, so it tolerates no failures and is no more resilient than a single member. Prefer odd-numbered etcd membership.
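The reasoning behind odd-numbered membership follows from the majority rule: quorum is more than half of the members, and fault tolerance is whatever remains. A short sketch of that arithmetic (function names are illustrative):

```python
def quorum(members: int) -> int:
    """Smallest majority of the cluster."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still available."""
    return members - quorum(members)


print("members quorum tolerated_failures")
for n in range(1, 6):
    print(n, quorum(n), fault_tolerance(n))
# 1 1 0
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

Going from 3 to 4 members, or from 1 to 2, adds a member that can fail without adding any tolerated failures.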

Load Balancer Best Practices

Use a stable API endpoint:

https://k8s-api.company.com:6443

The load balancer should:

check API server health
forward TCP traffic on port 6443
avoid sending traffic to failed control plane nodes
be highly available itself
use DNS or VIP for a stable endpoint

Example HAProxy-style backend concept:

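frontend kubernetes-api
  bind *:6443
  # TCP mode passes TLS through to the API servers unchanged
  mode tcp
  default_backend kube-apiserver

backend kube-apiserver
  mode tcp
  balance roundrobin
  # "check" enables health checks so failed nodes stop receiving traffic
  server master1 master1:6443 check
  server master2 master2:6443 check
  server master3 master3:6443 check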

Tip

The API load balancer is part of the control plane availability design. Do not make it a new single point of failure.
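The health-check behavior described above can be sketched as a filter over the backend list. This is a simplified illustration, not how any particular load balancer is implemented. The default probe only tests TCP reachability, which is a weaker signal than querying the API server's readiness endpoint, and the function names are hypothetical.

```python
import socket
from typing import Callable, Iterable


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_backends(
    backends: Iterable[tuple[str, int]],
    probe: Callable[[str, int], bool] = tcp_probe,
) -> list[tuple[str, int]]:
    """Keep only the backends that pass the probe."""
    return [(host, port) for host, port in backends if probe(host, port)]


# Demonstration with a fake probe so the example runs without a cluster.
backends = [("master1", 6443), ("master2", 6443), ("master3", 6443)]
down = {"master2"}
print(healthy_backends(backends, probe=lambda host, port: host not in down))
# prints [('master1', 6443), ('master3', 6443)]
```

Passing the probe as a parameter keeps the selection logic testable and lets a stricter check replace the TCP probe later.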

etcd Backup Best Practices

For self-managed clusters, back up etcd regularly.

Example command:

ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

Verify the snapshot (in etcd 3.5 and later, etcdutl snapshot status is the preferred form of this command):

ETCDCTL_API=3 etcdctl snapshot status snapshot.db

Warning

Backups are only useful if restore procedures are tested.
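For scheduled backups, the snapshot command is usually wrapped in a small script that writes timestamped files. The sketch below only builds the command. The flags are the ones from the example above, while the backup directory, the file naming pattern, and the function name are assumptions to adapt.

```python
import shlex
from datetime import datetime, timezone


def build_snapshot_command(backup_dir: str, now: datetime) -> tuple[dict, list[str]]:
    """Return (extra environment, argv) for a timestamped etcd snapshot."""
    snapshot_path = f"{backup_dir}/etcd-{now.strftime('%Y%m%dT%H%M%SZ')}.db"
    env = {"ETCDCTL_API": "3"}
    argv = [
        "etcdctl", "snapshot", "save", snapshot_path,
        "--endpoints=https://127.0.0.1:2379",
        "--cacert=/etc/kubernetes/pki/etcd/ca.crt",
        "--cert=/etc/kubernetes/pki/etcd/server.crt",
        "--key=/etc/kubernetes/pki/etcd/server.key",
    ]
    return env, argv


# Hypothetical backup directory and a fixed time for a reproducible example.
env, argv = build_snapshot_command(
    "/var/backups/etcd", datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
)
print(shlex.join(argv))
# To execute on an etcd node:
#   subprocess.run(argv, env={**os.environ, **env}, check=True)
```

Separating command construction from execution makes the script easy to test without touching a live etcd member.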

Production Best Practices

Recommended

Use multiple control plane nodes
Place a load balancer in front of API servers
Use leader election for scheduler and controller manager
Use an odd number of etcd members
Back up etcd regularly
Test etcd restore procedures
Spread nodes across availability zones when possible
Monitor API server, etcd, scheduler, and controller manager
Keep certificates and kubeconfigs secure
Avoid running user workloads on control plane nodes
Use infrastructure as code for repeatable HA setup

Do's

Use HA for production clusters
Use a stable API endpoint in kubeconfig
Use multiple API servers behind a load balancer
Use at least three etcd members for reliable quorum
Monitor leader election and failover behavior
Protect etcd with TLS
Schedule etcd backups
Test failure scenarios before production rollout

Don'ts

Don't rely on a single control plane node in production
Don't point kubeconfig directly to one master node
Don't run etcd without backups
Don't use even-numbered etcd clusters for critical production
Don't expose etcd publicly
Don't disable leader election
Don't make the load balancer a single point of failure
Don't run production workloads on control plane nodes unless explicitly designed

Failure

HA is not just “adding another master.” You must also design API access, leader election, etcd quorum, backups, and failure recovery.

Common Failure Scenarios

Failure	Impact	HA Mitigation
One API server fails	API requests still work	Load balancer routes to healthy API servers
Active scheduler fails	New pod scheduling pauses briefly	Standby scheduler becomes leader
Active controller manager fails	Reconciliation pauses briefly	Standby controller manager becomes leader
One worker fails	Pods on that worker become unavailable	Controllers such as ReplicaSets recreate pods on healthy workers
One etcd member fails	Cluster state still available if quorum remains	etcd quorum
Load balancer fails	API access may fail	HA load balancer or VIP

Troubleshooting Commands

Check nodes:

kubectl get nodes -o wide

Check control plane pods:

kubectl get pods -n kube-system -o wide

Check component endpoints:

kubectl get endpoints -n kube-system

Check leader election leases:

kubectl get leases -n kube-system

Describe a lease:

kubectl describe lease kube-controller-manager -n kube-system
kubectl describe lease kube-scheduler -n kube-system
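To track the current leader in a script, the same lease can be read as JSON (for example with `-o json`) and summarized. The sketch below parses the Lease spec fields `holderIdentity`, `leaseDurationSeconds`, `leaseTransitions`, and `renewTime`. The sample values, including the holder name, are made up for illustration.

```python
import json


def lease_summary(lease_json: str) -> dict:
    """Extract the leader election fields from a Lease object in JSON form."""
    spec = json.loads(lease_json).get("spec", {})
    return {
        "holder": spec.get("holderIdentity"),
        "lease_seconds": spec.get("leaseDurationSeconds"),
        "transitions": spec.get("leaseTransitions"),
        "renew_time": spec.get("renewTime"),
    }


# Hypothetical sample shaped like a Lease object from the kube-system namespace.
sample = json.dumps({
    "kind": "Lease",
    "metadata": {"name": "kube-scheduler", "namespace": "kube-system"},
    "spec": {
        "holderIdentity": "master-01_example-id",
        "leaseDurationSeconds": 15,
        "leaseTransitions": 3,
        "renewTime": "2024-01-02T03:04:05.000000Z",
    },
})

summary = lease_summary(sample)
print(summary["holder"], summary["transitions"])
```

A holder that changes often, or a transition count that keeps climbing, suggests repeated failovers worth investigating.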

Check API server static pod manifest:

cat /etc/kubernetes/manifests/kube-apiserver.yaml

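Check etcd members (if etcd requires TLS, add the same --endpoints, --cacert, --cert, and --key flags used in the snapshot command to this and the next command):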

ETCDCTL_API=3 etcdctl member list

Check etcd health:

ETCDCTL_API=3 etcdctl endpoint health --cluster

Tip

In kubeadm clusters, control plane components usually run as static pods under /etc/kubernetes/manifests.

Summary

Quote

HA removes single points of failure from the Kubernetes control plane
API servers run active-active behind a load balancer
Scheduler and controller manager run active-standby using leader election
etcd protects cluster state and requires quorum
Stacked topology is easier but has higher failure risk
External etcd topology is safer but more complex
Production clusters require backups, monitoring, secure certificates, and tested recovery