How to Debug Kubernetes "FailedScheduling" Errors

Pod scheduling issues are one of the most common Kubernetes errors. There are several reasons why a new pod can get stuck in a Pending state with FailedScheduling as its reason. A pod that shows this status will not start any containers, so you will not be able to use your application.

Pending pods caused by scheduling issues normally won't start without manual intervention. You will need to find the root cause and take action to repair your cluster. In this article, you'll learn how to diagnose and fix this problem so you can get your workloads running again.

Identifying a FailedScheduling Error

It is normal for pods to display a Pending status for a short time after adding them to your cluster. Kubernetes needs to schedule container instances on your nodes, and those nodes need to pull the image from its registry. The first sign of a pod scheduling failure is when the pod still shows as Pending after the usual start-up period has elapsed. You can check the status by running kubectl's get pods command:

$ kubectl get pods
NAME       READY   STATUS    RESTARTS   AGE
demo-pod   0/1     Pending   0          4m05s

demo-pod was created more than four minutes ago, but it is still in the Pending state. Pods don't usually take that long to start containers, so it's time to start investigating what Kubernetes is waiting for.

The next diagnostic step is to retrieve the pod's event history using the describe pod command:

$ kubectl describe pod demo-pod
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  ...
  Warning  FailedScheduling  4m    default-scheduler  0/4 nodes are available: 1 Too many pods, 3 Insufficient cpu.

The event log confirms that a FailedScheduling error is the reason for the prolonged Pending state. This event is reported when Kubernetes cannot allocate the pod to any of your cluster's worker nodes.

The event message reveals why scheduling is currently not possible: there are four nodes in the cluster, but none of them can take the pod. Three of the nodes have insufficient CPU capacity while the other has reached a ceiling on the number of pods it can accept.

Understanding FailedScheduling Errors and Similar Issues

Kubernetes can only schedule pods on nodes with spare resources. Nodes that have run out of CPU or memory cannot take on more pods. Pods can also fail to schedule if they explicitly request more resources than any single node can provide. This behavior keeps your cluster stable.

The Kubernetes control plane knows which pods are already allocated to the nodes in your cluster. It uses this information to determine the set of nodes that can receive a new pod. A scheduling error occurs when no candidate is available, leaving the pod stuck Pending until capacity is freed up.
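
Resource requests are declared per container in the pod manifest. The following is a minimal sketch with illustrative values and image: it asks for 500m of CPU and 1Gi of memory, and if no node has that much unreserved capacity, the pod will stay Pending:

apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
    - name: app
      image: nginx:1.25        # illustrative image; any container image behaves the same way
      resources:
        requests:
          cpu: 500m            # the scheduler only considers nodes with 500m of unreserved CPU
          memory: 1Gi          # and 1Gi of unreserved memory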

Kubernetes may also not schedule pods for other reasons. Nodes can be deemed ineligible to host a Pod in several ways, even if they have adequate system resources:

  • The node may have been cordoned by an administrator to prevent it from receiving new pods before a maintenance operation.
  • The node could have a taint that prevents pods from scheduling. Your pod will not be accepted by the node unless it has a matching toleration.
  • Your pod may be requesting a hostPort that is already bound on the node. Nodes can only provide a particular port number to one pod at a time.
  • Your pod may be using a nodeSelector, which means it must be scheduled on a node with a particular label (see the sketch after this list). Unlabeled nodes will not be eligible.
  • Pod and node affinities and anti-affinities might be unsatisfiable, causing a scheduling conflict that prevents new pods from being accepted.
  • The pod may have a nodeName field that identifies a specific node to schedule on. The pod will be stuck Pending if that node is offline or cordoned.
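
To illustrate the nodeSelector case, here is a minimal sketch that pins a pod to nodes carrying a hypothetical disktype: ssd label; the label key and value are only examples:

spec:
  nodeSelector:
    disktype: ssd   # only nodes labeled disktype=ssd are eligible; a typo here leaves the pod Pending

The matching label can be applied to a node with kubectl label nodes node-1 disktype=ssd.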

It is the responsibility of kube-scheduler, the Kubernetes scheduler, to work through these conditions and identify the set of nodes that can host a new pod. A FailedScheduling event occurs when none of the nodes meet the criteria.

Resolving the FailedScheduling Status

The message displayed next to FailedScheduling usually reveals why each node in your cluster couldn't take the pod. You can use this information to start troubleshooting. In the example above, the cluster had four nodes: three where the available CPU had been exhausted and one that had reached its pod count limit.

Cluster capacity is the root cause in this case. You can scale your cluster with new nodes to address hardware consumption issues, adding resources that will provide additional flexibility. As this will also increase your costs, it is worth checking first if you have redundant pods in your cluster. Removing unused resources will free up capacity for new ones.
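
One quick way to spot candidates for removal is to list every pod in the cluster alongside the node it is scheduled on, assuming you have read access across namespaces:

$ kubectl get pods --all-namespaces -o wide

The NODE column in the output shows where each pod landed, which makes it easier to see which workloads are occupying the busiest nodes.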

You can inspect the resources available on each of your nodes using the describe node command:

$ kubectl describe node demo-node
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests     Limits
  --------           --------     ------
  cpu                812m (90%)   202m (22%)
  memory             905Mi (57%)  715Mi (45%)
  ephemeral-storage  0 (0%)       0 (0%)
  hugepages-2Mi      0 (0%)       0 (0%)

The pods on this node are already requesting 57% of the available memory. If a new pod requested 1Gi for itself, the node would not be able to accept the scheduling request. Monitoring this information for each of your nodes can help you assess whether your cluster is becoming overcommitted. It is important to keep spare capacity in case one of your nodes fails and its workloads need to be rescheduled onto another.
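
If the metrics-server add-on is installed in your cluster, kubectl top gives a live view of actual consumption, which complements the request and limit figures reported by describe node:

$ kubectl top nodes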

Scheduling failures caused by a lack of schedulable nodes will display a message similar to the following in the FailedScheduling event:

0/4 nodes are available: 4 node(s) were unschedulable

Nodes that cannot be scheduled on because they have been cordoned will include SchedulingDisabled in their status field:

$ kubectl get nodes
NAME     STATUS                     ROLES                  AGE   VERSION
node-1   Ready,SchedulingDisabled   control-plane,master   26m   v1.23.3

You can uncordon the node to allow it to receive new pods:

$ kubectl uncordon node-1
node/node-1 uncordoned

When nodes are not cordoned and have sufficient resources, scheduling errors are usually caused by taints or by an incorrect nodeSelector field on your pod. If you use nodeSelector, verify that you haven't made a typo and that there are nodes in your cluster that carry the labels you specified.
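
To confirm which labels your nodes actually carry, list them with their labels and compare the output against your pod's nodeSelector:

$ kubectl get nodes --show-labels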

When nodes are tainted, make sure you've included the corresponding toleration in your pod's manifest. As an example, here's a node that's been tainted so pods won't schedule unless they have a demo-taint: allow toleration:

$ kubectl taint nodes node-1 demo-taint=allow:NoSchedule

Edit your pod manifests so they can schedule on the node:

spec:
  tolerations:
    - key: demo-taint
      operator: Equal
      value: allow
      effect: NoSchedule

Solving the problem that causes the FailedScheduling state will allow Kubernetes to resume scheduling your pending pods. They will start running automatically shortly after the control plane detects the changes to your nodes. You don't need to restart or manually recreate your pods unless the problem was caused by errors in your pod manifest, such as incorrect affinity or nodeSelector fields.
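
If you do need to recreate a pod after correcting its manifest, a typical sequence looks like the following, assuming the manifest is saved as demo-pod.yaml (the filename is illustrative):

$ kubectl delete pod demo-pod
$ kubectl apply -f demo-pod.yaml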

Summary

FailedScheduling errors occur when Kubernetes cannot place a new pod on a node in your cluster. This is often caused by your existing nodes running out of hardware resources such as CPU, memory, and disk. When this is the case, you can fix the problem by scaling your cluster to include additional nodes.

Scheduling failures also occur when pods specify node affinities, anti-affinities, or selectors that cannot currently be satisfied by the nodes available in your cluster. Cordoned and tainted nodes further reduce the options available to Kubernetes. This type of problem can be solved by checking your manifests for typos in labels and removing constraints that you no longer need.
