9.3 Configure High Availability in Kubernetes

Abstract

High Availability (HA) in Kubernetes removes single points of failure from the cluster control plane.

In production, HA is required because losing a single control plane node should not stop cluster management, scheduling, controller reconciliation, or API access.


Why High Availability Is Needed

If the only control plane node in a cluster fails:

  • existing pods may continue running on worker nodes
  • users may still access running applications
  • kubectl access may fail because the API server is unavailable
  • failed pods may not be recreated
  • new pods may not be scheduled
  • controllers cannot reconcile desired state
  • cluster operations become unavailable

Warning

Running applications may survive a control plane failure temporarily, but the cluster cannot properly heal, scale, schedule, or accept management requests until the control plane is restored.


What HA Protects

A production HA Kubernetes design should provide redundancy for:

Area                 Why It Matters
-------------------  -------------------------------------
Control plane nodes  Avoid losing cluster management
API servers          Keep kubectl and API access available
Controller managers  Keep reconciliation running
Schedulers           Keep pod scheduling available
etcd                 Protect cluster state
Worker nodes         Keep workloads available
Load balancer        Provide a stable API endpoint

Success

A highly available cluster avoids a single point of failure across both control plane and worker components.


Control Plane Components in HA

A control plane node commonly runs:

  • kube-apiserver
  • kube-controller-manager
  • kube-scheduler
  • etcd

In an HA setup, these components run on multiple control plane nodes.

Control Plane Node 1
  ├── kube-apiserver
  ├── kube-controller-manager
  ├── kube-scheduler
  └── etcd

Control Plane Node 2
  ├── kube-apiserver
  ├── kube-controller-manager
  ├── kube-scheduler
  └── etcd

API Server High Availability

The kube-apiserver runs in active-active mode: every API server instance serves requests at the same time.

       kubectl / API clients
                 │
                 ▼
        Load Balancer :6443
                 │
                 ▼
 ┌───────────────┬───────────────┐
 │ API Server 1  │ API Server 2  │
 └───────────────┴───────────────┘

Note

API servers process requests independently, so multiple instances can safely run at the same time.


API Server Load Balancer

With multiple control plane nodes, clients should not connect directly to any one control plane node.

Instead, configure a load balancer in front of API servers.

Example endpoint:

https://load-balancer:6443

Your kubeconfig should point to the load balancer:

clusters:
- name: production
  cluster:
    server: https://load-balancer:6443
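
One way to catch a kubeconfig that still points at a single node is to compare each cluster's server field with the expected load balancer address. The sketch below is illustrative only. It assumes the kubeconfig has been exported as JSON (for example with kubectl config view -o json); the function name, the "legacy" cluster, and its address are hypothetical.

```python
import json


def clusters_not_using(kubeconfig_json: str, expected_server: str) -> list[str]:
    """Return the names of clusters whose server URL differs from expected_server."""
    config = json.loads(kubeconfig_json)
    mismatched = []
    for entry in config.get("clusters") or []:
        server = entry.get("cluster", {}).get("server")
        if server != expected_server:
            mismatched.append(entry.get("name", "<unnamed>"))
    return mismatched


# Hypothetical sample: one cluster uses the load balancer, one points at a node.
SAMPLE = json.dumps({
    "clusters": [
        {"name": "production", "cluster": {"server": "https://load-balancer:6443"}},
        {"name": "legacy", "cluster": {"server": "https://10.240.0.10:6443"}},
    ]
})

if __name__ == "__main__":
    print(clusters_not_using(SAMPLE, "https://load-balancer:6443"))
```

Any cluster name the function returns is a candidate for being repointed at the stable endpoint.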

Tip

Use a highly available load balancer such as HAProxy, NGINX, a cloud load balancer, or a virtual IP solution.


Controller Manager and Scheduler HA

The kube-controller-manager and kube-scheduler must not actively perform the same work at the same time.

They run in active-standby mode using leader election.

Component                HA Mode             Reason
-----------------------  ------------------  --------------------------------
kube-apiserver           Active-active       Can process independent requests
kube-controller-manager  Active-standby      Prevent duplicate reconciliation
kube-scheduler           Active-standby      Prevent duplicate scheduling
etcd                     Distributed quorum  Protect cluster state

Warning

If multiple schedulers or controller managers act as leaders at the same time, duplicate or conflicting actions may occur.


Leader Election

Leader election ensures that only one instance of the controller manager, and only one instance of the scheduler, is active at a time.

Typical flow:

  1. All instances start
  2. Each tries to acquire a lease
  3. One becomes the leader
  4. Others remain standby
  5. If the leader fails, another instance takes over
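
The flow above can be sketched as a small in-memory simulation. This is a simplified model under stated assumptions, not the real implementation: in Kubernetes the lease is an API object and updates go through the API server, whereas here the Lease class, the try_acquire_or_renew function, and the instance names "a" and "b" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: Optional[str] = None
    renew_time: float = 0.0
    duration: float = 15.0  # seconds; mirrors --leader-elect-lease-duration

    def expired(self, now: float) -> bool:
        # A lease with no holder, or one not renewed within its duration, is free.
        return self.holder is None or now - self.renew_time > self.duration


def try_acquire_or_renew(lease: Lease, candidate: str, now: float) -> bool:
    """Return True if candidate is the leader after this attempt."""
    if lease.holder == candidate or lease.expired(now):
        lease.holder = candidate
        lease.renew_time = now
        return True
    return False  # another instance holds a valid lease; remain standby


if __name__ == "__main__":
    lease = Lease()
    print(try_acquire_or_renew(lease, "a", now=0.0))   # "a" acquires the free lease
    print(try_acquire_or_renew(lease, "b", now=1.0))   # "b" stays standby
    print(try_acquire_or_renew(lease, "a", now=10.0))  # "a" renews
    # "a" stops renewing after t=10; "b" keeps retrying.
    print(try_acquire_or_renew(lease, "b", now=20.0))  # lease still valid
    print(try_acquire_or_renew(lease, "b", now=26.0))  # lease expired, "b" takes over
```

The standby only takes over once the lease has gone unrenewed for longer than its duration, which is why failover is brief but not instantaneous.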

Example controller manager options:

kube-controller-manager \
  --leader-elect=true \
  --leader-elect-lease-duration=15s \
  --leader-elect-renew-deadline=10s \
  --leader-elect-retry-period=2s

Note

--leader-elect defaults to true for kube-controller-manager and kube-scheduler, so it normally does not need to be set explicitly.


Leader Election Timing

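Option                         Purpose                                                                      Common Default
-----------------------------  ---------------------------------------------------------------------------  --------------
--leader-elect                 Enables leader election                                                      true
--leader-elect-lease-duration  How long standbys wait after the last observed renewal before taking over   15s
--leader-elect-renew-deadline  How long the active leader keeps trying to renew before it stops leading     10s
--leader-elect-retry-period    How long instances wait between attempts to acquire or renew the lease       2s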

Tip

Do not tune leader election values casually in production. Incorrect values can cause unnecessary failovers or slow recovery.
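
To see why these values are interdependent, the following sketch checks the ordering constraints between the three timings and estimates a rough takeover time. It is a hedged illustration: the 1.2 jitter factor reflects my understanding of the client-go leader election library, and the function names and the takeover estimate (which ignores jitter, clock skew, and API latency) are assumptions of this sketch.

```python
JITTER_FACTOR = 1.2  # assumption: jitter factor applied to the retry period


def validate_timing(lease_duration: float, renew_deadline: float,
                    retry_period: float) -> list[str]:
    """Return a list of problems with the given leader election timings (seconds)."""
    problems = []
    if lease_duration <= renew_deadline:
        problems.append("lease duration must be greater than renew deadline")
    if renew_deadline <= JITTER_FACTOR * retry_period:
        problems.append("renew deadline must be greater than 1.2 x retry period")
    return problems


def rough_takeover_seconds(lease_duration: float, retry_period: float) -> float:
    """Rough upper bound: wait out the lease, then acquire on the next retry."""
    return lease_duration + retry_period


if __name__ == "__main__":
    print(validate_timing(15, 10, 2))        # the defaults from the table
    print(rough_takeover_seconds(15, 2))
    print(validate_timing(10, 10, 2))        # lease no longer outlives the deadline
```

Shortening the lease duration speeds up failover but leaves less margin for a healthy leader to renew in time, which is how careless tuning leads to unnecessary failovers.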


etcd in HA

etcd stores all Kubernetes cluster state.

Examples of data stored in etcd:

  • nodes
  • pods
  • deployments
  • services
  • secrets
  • configmaps
  • RBAC objects
  • cluster configuration

Danger

If etcd data is lost and no backup exists, the Kubernetes cluster state may be unrecoverable.


etcd Access from API Server

The kube-apiserver is the only Kubernetes control plane component that directly communicates with etcd.

Example API server configuration:

kube-apiserver \
  --etcd-servers=https://10.240.0.10:2379,https://10.240.0.11:2379,https://10.240.0.12:2379 \
  --etcd-cafile=/var/lib/kubernetes/ca.pem \
  --etcd-certfile=/var/lib/kubernetes/apiserver-etcd-client.crt \
  --etcd-keyfile=/var/lib/kubernetes/apiserver-etcd-client.key

Note

The API server can connect to any healthy etcd member from the configured list.


HA Control Plane Topologies

There are two common HA control plane topologies:

  1. Stacked etcd topology
  2. External etcd topology

Stacked etcd Topology

In a stacked topology, each control plane node also runs an etcd member.

Control Plane Node 1
  ├── kube-apiserver
  ├── kube-controller-manager
  ├── kube-scheduler
  └── etcd

Control Plane Node 2
  ├── kube-apiserver
  ├── kube-controller-manager
  ├── kube-scheduler
  └── etcd

Advantages

  • easier to set up
  • easier to manage
  • fewer servers required
  • common for smaller HA clusters

Disadvantages

  • control plane and etcd fail together on the same node
  • losing a node reduces both API server capacity and etcd fault tolerance
  • higher risk during failures

Warning

Stacked topology is simpler, but a failed node removes both a control plane instance and an etcd member.


External etcd Topology

In an external etcd topology, etcd runs on separate dedicated nodes.

Control Plane Nodes
  ├── kube-apiserver
  ├── kube-controller-manager
  └── kube-scheduler

External etcd Nodes
  └── etcd cluster

Advantages

  • less risky for etcd availability
  • control plane failure does not directly remove etcd members
  • better separation of responsibilities
  • preferred for stronger production isolation

Disadvantages

  • harder to set up
  • requires more servers
  • more certificates and networking configuration
  • more operational complexity

Success

External etcd topology is safer for critical production clusters because etcd is isolated from control plane node failures.


Stacked vs External etcd

Area               Stacked etcd              External etcd
-----------------  ------------------------  ----------------------------
Setup complexity   Lower                     Higher
Server count       Fewer                     More
Management effort  Easier                    Harder
Failure isolation  Lower                     Higher
Production safety  Medium                    Higher
Best for           Small/medium HA clusters  Critical production clusters
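
The difference in server count and failure isolation can be made concrete with a small model. This is only a sketch: the node names (cp-1, etcd-1, and so on), the three-node sizing, and the surviving function are hypothetical, and only the API server and etcd are modelled.

```python
def surviving(nodes: dict[str, set[str]], failed: str) -> dict[str, int]:
    """Count the remaining instances of each component after one node fails."""
    counts: dict[str, int] = {}
    for name, components in nodes.items():
        if name == failed:
            continue
        for component in components:
            counts[component] = counts.get(component, 0) + 1
    return counts


# Stacked: every control plane node also hosts an etcd member.
STACKED = {f"cp-{i}": {"kube-apiserver", "etcd"} for i in (1, 2, 3)}

# External: etcd members live on their own dedicated nodes.
EXTERNAL = {
    **{f"cp-{i}": {"kube-apiserver"} for i in (1, 2, 3)},
    **{f"etcd-{i}": {"etcd"} for i in (1, 2, 3)},
}

if __name__ == "__main__":
    print(len(STACKED), surviving(STACKED, "cp-1"))
    print(len(EXTERNAL), surviving(EXTERNAL, "cp-1"))
```

In this model the stacked layout needs 3 servers and loses one API server plus one etcd member when cp-1 fails, while the external layout needs 6 servers and keeps all 3 etcd members.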

Recommended Production Design

A simple production HA design includes:

   Users / kubectl / API clients
                 │
                 ▼
    Control Plane Load Balancer
                 │
                 ▼
 ┌────────────────┬────────────────┐
 │ master-01      │ master-02      │
 │ API Server     │ API Server     │
 │ Controller Mgr │ Controller Mgr │
 │ Scheduler      │ Scheduler      │
 │ etcd           │ etcd           │
 └────────────────┴────────────────┘

 ┌────────────────┬────────────────┐
 │ worker-01      │ worker-02      │
 └────────────────┴────────────────┘

Note

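The diagram shows two control plane nodes for brevity. With stacked etcd, use three or more control plane nodes so that etcd has an odd number of members; a two-member etcd cluster loses quorum as soon as either member fails.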


Minimum Production HA Considerations

Component            Recommendation
-------------------  ---------------------------------------------------
Control plane nodes  At least 3 for stronger HA
etcd members         Odd number, commonly 3 or 5
API endpoint         Load balancer or virtual IP
Worker nodes         Multiple workers across failure zones
etcd backup          Scheduled and tested
Certificates         Properly managed and rotated
Monitoring           Required for API server, etcd, nodes, and controllers

Danger

A two-member etcd cluster needs both members for quorum, so it tolerates no failures at all. Prefer an odd number of etcd members.
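
The preference for odd membership follows from majority arithmetic: a cluster of n members needs floor(n/2) + 1 of them to agree. A minimal sketch, with hypothetical function names:

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    if members < 1:
        raise ValueError("an etcd cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority remains."""
    return members - quorum(members)


if __name__ == "__main__":
    print("members  quorum  tolerated failures")
    for n in range(1, 6):
        print(f"{n:7d}  {quorum(n):6d}  {fault_tolerance(n):18d}")
```

Two members tolerate 0 failures, the same as one member, and four members tolerate 1 failure, the same as three. Adding a member to reach an even count therefore adds a machine that can fail without adding tolerance.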


Load Balancer Best Practices

Use a stable API endpoint:

https://k8s-api.company.com:6443

The load balancer should:

  • check API server health
  • forward TCP traffic on port 6443
  • avoid sending traffic to failed control plane nodes
  • be highly available itself
  • use DNS or VIP for a stable endpoint

Example HAProxy-style backend concept:

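# TCP passthrough: TLS is terminated by the API servers, not by the proxy
frontend kubernetes-api
  bind *:6443
  mode tcp
  default_backend kube-apiserver

backend kube-apiserver
  mode tcp
  balance roundrobin
  # "check" enables a periodic health check for each server
  server master1 master1:6443 check
  server master2 master2:6443 check
  server master3 master3:6443 check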

Tip

The API load balancer is part of the control plane availability design. Do not make it a new single point of failure.
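
The behaviour expected from the load balancer, rotating across API servers while skipping failed ones, can be sketched as follows. This is a toy model for illustration, not how HAProxy or any cloud load balancer is implemented; the RoundRobinBalancer class and its methods are hypothetical, and health is set by hand rather than by real probes.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Round-robin selection that skips backends marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark(self, backend: str, healthy: bool) -> None:
        # In a real balancer this would be driven by periodic health checks.
        if healthy:
            self.healthy.add(backend)
        else:
            self.healthy.discard(backend)

    def pick(self) -> str:
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            backend = next(self._ring)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy API server backends")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["master1:6443", "master2:6443", "master3:6443"])
    lb.mark("master2:6443", healthy=False)
    print([lb.pick() for _ in range(4)])
```

With master2 marked unhealthy, requests alternate between master1 and master3 until the failed node is marked healthy again.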


etcd Backup Best Practices

For self-managed clusters, back up etcd regularly.

Example command:

ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

Verify the snapshot (on etcd 3.5 and later, etcdutl snapshot status is the preferred equivalent):

ETCDCTL_API=3 etcdctl snapshot status snapshot.db

Warning

Backups are only useful if restore procedures are tested.
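
Scheduled backups are easier to manage when each snapshot has a unique, sortable name. The sketch below only builds the argument list for the snapshot command shown earlier, with a UTC timestamp in the file name. The backup directory /var/backups/etcd, the file naming scheme, and the function name are assumptions for illustration; the endpoint and certificate paths are copied from the example above.

```python
from datetime import datetime, timezone
from typing import Optional


def snapshot_command(backup_dir: str, now: Optional[datetime] = None) -> list[str]:
    """Build an etcdctl snapshot command with a timestamped file name."""
    now = now or datetime.now(timezone.utc)
    path = f"{backup_dir}/etcd-{now:%Y%m%dT%H%M%SZ}.db"
    return [
        "etcdctl", "snapshot", "save", path,
        "--endpoints=https://127.0.0.1:2379",
        "--cacert=/etc/kubernetes/pki/etcd/ca.crt",
        "--cert=/etc/kubernetes/pki/etcd/server.crt",
        "--key=/etc/kubernetes/pki/etcd/server.key",
    ]


if __name__ == "__main__":
    # /var/backups/etcd is a hypothetical location; adjust to your environment.
    print(" ".join(snapshot_command("/var/backups/etcd")))
```

The resulting list could be handed to a scheduler or to subprocess.run; copying snapshots off the node and restoring them into a test cluster are separate steps that this sketch does not cover.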


Production Best Practices

Recommended

  • Use multiple control plane nodes
  • Place a load balancer in front of API servers
  • Use leader election for scheduler and controller manager
  • Use an odd number of etcd members
  • Back up etcd regularly
  • Test etcd restore procedures
  • Spread nodes across availability zones when possible
  • Monitor API server, etcd, scheduler, and controller manager
  • Keep certificates and kubeconfigs secure
  • Avoid running user workloads on control plane nodes
  • Use infrastructure as code for repeatable HA setup

Do's

  • Use HA for production clusters
  • Use a stable API endpoint in kubeconfig
  • Use multiple API servers behind a load balancer
  • Use at least three etcd members for reliable quorum
  • Monitor leader election and failover behavior
  • Protect etcd with TLS
  • Schedule etcd backups
  • Test failure scenarios before production rollout

Don'ts

  • Don't rely on a single control plane node in production
  • Don't point kubeconfig directly to one master node
  • Don't run etcd without backups
  • Don't use even-numbered etcd clusters for critical production
  • Don't expose etcd publicly
  • Don't disable leader election
  • Don't make the load balancer a single point of failure
  • Don't run production workloads on control plane nodes unless explicitly designed

Failure

HA is not just “adding another master.” You must also design API access, leader election, etcd quorum, backups, and failure recovery.


Common Failure Scenarios

Failure                          Impact                                           HA Mitigation
-------------------------------  -----------------------------------------------  ------------------------------------------
One API server fails             API requests still work                          Load balancer routes to healthy API servers
Active scheduler fails           New pod scheduling pauses briefly                Standby scheduler becomes leader
Active controller manager fails  Reconciliation pauses briefly                    Standby controller manager becomes leader
One worker fails                 Pods on that worker fail                         ReplicaSets recreate pods elsewhere
One etcd member fails            Cluster state still available if quorum remains  Remaining etcd members keep quorum
Load balancer fails              API access may fail                              HA load balancer or VIP

Troubleshooting Commands

Check nodes:

kubectl get nodes -o wide

Check control plane pods:

kubectl get pods -n kube-system -o wide

Check component endpoints:

kubectl get endpoints -n kube-system

Check leader election leases:

kubectl get leases -n kube-system

Describe a lease:

kubectl describe lease kube-controller-manager -n kube-system
kubectl describe lease kube-scheduler -n kube-system
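
The same lease can be inspected programmatically. The sketch below reads the current holder and checks whether the lease looks stale, assuming the lease was exported as JSON (for example with kubectl get lease kube-scheduler -n kube-system -o json). The sample document, the holder name master-01_example, and the function name are hypothetical; only the spec fields holderIdentity, renewTime, and leaseDurationSeconds are relied on.

```python
import json
from datetime import datetime, timedelta, timezone


def lease_status(lease_json: str, now: datetime) -> tuple[str, bool]:
    """Return (holder identity, whether the lease has expired as of now)."""
    spec = json.loads(lease_json)["spec"]
    # renewTime is assumed to be an RFC 3339 UTC timestamp with microseconds.
    renewed = datetime.strptime(
        spec["renewTime"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    expires = renewed + timedelta(seconds=spec["leaseDurationSeconds"])
    return spec["holderIdentity"], now > expires


# Hypothetical sample; real output contains additional metadata fields.
SAMPLE_LEASE = json.dumps({
    "kind": "Lease",
    "metadata": {"name": "kube-scheduler", "namespace": "kube-system"},
    "spec": {
        "holderIdentity": "master-01_example",
        "leaseDurationSeconds": 15,
        "renewTime": "2024-01-02T03:04:05.000000Z",
    },
})

if __name__ == "__main__":
    print(lease_status(SAMPLE_LEASE, datetime.now(timezone.utc)))
```

A lease that stays expired, or a holder identity that changes frequently, suggests the leader is struggling to renew and is worth investigating.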

Check API server static pod manifest:

cat /etc/kubernetes/manifests/kube-apiserver.yaml

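Check etcd members (if etcd requires TLS client authentication, add the --endpoints, --cacert, --cert, and --key flags from the backup example to this and the following etcdctl command):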

ETCDCTL_API=3 etcdctl member list

Check etcd health:

ETCDCTL_API=3 etcdctl endpoint health --cluster

Tip

In kubeadm clusters, control plane components usually run as static pods whose manifests are stored in /etc/kubernetes/manifests.


Summary

Quote

  • HA removes single points of failure from the Kubernetes control plane
  • API servers run active-active behind a load balancer
  • Scheduler and controller manager run active-standby using leader election
  • etcd protects cluster state and requires quorum
  • Stacked topology is easier but has higher failure risk
  • External etcd topology is safer but more complex
  • Production clusters require backups, monitoring, secure certificates, and tested recovery