Node Failure Guide
A Node failure can occur for many reasons, including hardware failures and network outages. This guide describes what to expect when a Node failure occurs and how to recover from one. Recovery depends on the Storage Provisioner and the type of storage that you use.
NOTE
This guide assumes that the storage provided in the cluster is physically separate from the Node and is recoverable. It does not apply to local storage on the Node.
What to expect
By default, when a Node fails:
- It may take up to a minute for the failure to be reflected in the Kubernetes API server and for the Node status to change to NotReady (the command sketch after this list shows how to watch these transitions).
- After the Node status has been NotReady for about five minutes, the status of the Pods on that Node changes to Unknown or NodeLost.
- The status of Pods managed by controllers, such as DaemonSets, StatefulSets, and Deployments, changes to Terminating. NOTE: Pods without a controller, started directly from a PodSpec, are not terminated. They must be manually deleted and recreated.
- New Pods start on the Nodes that remain in Ready status. NOTE: StatefulSets are a special case. The StatefulSet controller maintains an ordinal list of Pods, one for each name, and will not start a new Pod with the name of an existing Pod.
- Pods with associated Persistent Volumes of type ReadWriteOnce do not become Ready. They try to attach to the existing volumes that are still attached to the old Pod, which is still Terminating, and a Persistent Volume of type ReadWriteOnce can be attached to only one Node at a time, while the new Pod resides on another Node.
- If multiple Availability Domains are used in the Kubernetes cluster and the failed Node is the last one in its Availability Domain, the existing volumes are no longer reachable by new Pods in a separate Availability Domain.
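As a minimal sketch, the following kubectl commands show one way to watch these status transitions during a Node failure. The Pod name is hypothetical; adjust names and namespaces for your cluster.

    # Watch Node status change from Ready to NotReady (may take up to a minute).
    kubectl get nodes -w

    # Watch Pod statuses on the failed Node change to Unknown or Terminating.
    kubectl get pods -o wide -w

    # Inspect a specific Pod's events for eviction details (hypothetical name).
    kubectl describe pod my-app-0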
About recovery
After a Node fails, if the Node can be recovered within five minutes, the Pods return to a Running state. If the Node is not recovered within five minutes, the Pods complete termination and are deleted from the Kubernetes API server. New Pods that have Persistent Volumes of type ReadWriteOnce are then able to mount the Persistent Volumes and change to Running.
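A quick way to verify this sequence, assuming hypothetical Pod and volume names, is sketched below. VolumeAttachment objects record which Node a volume is attached to.

    # Confirm the old Pod has finished terminating and is gone from the API server.
    kubectl get pods -o wide

    # Confirm the ReadWriteOnce volume has detached from the failed Node.
    kubectl get volumeattachments

    # Verify the replacement Pod reaches Running once the volume is mounted.
    kubectl get pod my-app-0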
If a Node cannot be recovered and is replaced, deleting the Node from the Kubernetes API server terminates the old Pods and releases the Persistent Volumes of type ReadWriteOnce to be mounted by new Pods.
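For example, assuming the failed Node is named worker-2 (a hypothetical name), removing it from the API server looks like this:

    # Identify the failed Node.
    kubectl get nodes

    # Delete the failed Node object; its old Pods are terminated and their
    # ReadWriteOnce Persistent Volumes are released for new Pods to mount.
    kubectl delete node worker-2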
If multiple Availability Domains are used in the Kubernetes cluster, the replacement Node should be added to the same Availability Domain that the deleted Node occupied. This allows the Pods to be scheduled on the replacement Node, which can reach the Persistent Volumes in that Availability Domain, and the Pod status then changes to Running.
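One way to check placement is through the well-known topology zone label, as sketched below. The Persistent Volume name is hypothetical, and older clusters may carry the failure-domain.beta.kubernetes.io/zone label instead.

    # Show each Node's zone to confirm the replacement Node joined the same
    # Availability Domain as the Node it replaced.
    kubectl get nodes -L topology.kubernetes.io/zone

    # Inspect a Persistent Volume's node affinity to see which zone it
    # requires (hypothetical PV name).
    kubectl describe pv pvc-0a1b2c3d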
Do not forcefully delete Pods or Persistent Volumes on a failed Node that you plan to recover or replace. Force deleting them may lead to data loss and, in the case of StatefulSets, to split-brain scenarios. For more information, see Force Delete StatefulSet Pods in the Kubernetes documentation.
You can force delete Pods and Persistent Volumes when a failed Node cannot be recovered or replaced in the same Availability Domain as the original Node.
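In that case, a force delete looks like the sketch below; the Pod and claim names are hypothetical, and the data-loss caveat above still applies.

    # Force delete the stuck Pod without waiting for graceful termination.
    kubectl delete pod my-app-0 --grace-period=0 --force

    # If the Persistent Volume must also be released, delete the claim that
    # binds it (hypothetical PVC name).
    kubectl delete pvc data-my-app-0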