General Kubernetes Application/Deployment Issues

This article contains troubleshooting steps related to Kubernetes application and deployment issues.

Kubernetes Application Issues

See Troubleshooting Applications in the Kubernetes Documentation for instructions (link opens an external website in a new browser tab/window).

Kubernetes Pod Deployment Issues

See Troubleshooting Kubernetes Deployments for instructions (link opens an external website in a new browser tab/window).

Kubernetes Node Upgrade Issues

Kubernetes master node upgrade fails

The Kubernetes upgrade process assumes that pods are not scheduled to run on master nodes, so master nodes are not drained during the upgrade. If pods are running on master nodes, the Kubernetes upgrade can fail.

By default, master nodes have a NoSchedule taint that prevents pods from being scheduled on them. Do not remove this taint. Also, do not give pods a NoSchedule toleration, as this would make it possible for them to run on master nodes even when the taint is present. If the Kubernetes upgrade fails because pods are running on master nodes, reinstate the NoSchedule taint on the master nodes if it has been removed, and remove the NoSchedule toleration from any pods to which it has been added. After pods are no longer running on master nodes, run the Kubernetes upgrade operation again.
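For example, the following commands list the pods currently scheduled on a master node and reinstate the NoSchedule taint. The hostname is a placeholder, and the taint key shown assumes the default node-role.kubernetes.io/master key; on newer Kubernetes versions the key may be node-role.kubernetes.io/control-plane instead:

kubectl get pods --all-namespaces --field-selector spec.nodeName=<K8s master hostname>

kubectl taint nodes <K8s master hostname> node-role.kubernetes.io/master=:NoSchedule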

Upgrade of a Kubernetes worker node fails

The Kubernetes upgrade drains each worker node before upgrading it. If a worker node cannot be drained, the upgrade of that node may fail.

Do not configure resources, such as persistent volume claims, that prevent worker nodes from being drained. If the Kubernetes upgrade fails on a worker node because the node could not be drained (the Status message for the host says failed to drain node), remove the resources that are preventing the node from being drained, then retry the upgrade for any failed worker nodes.
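For example, before retrying the upgrade you can check which pods are blocking the drain on a worker node. The hostname is a placeholder, and the second command previews the drain without evicting anything; it assumes a reasonably recent kubectl (older versions use the boolean --dry-run flag instead of --dry-run=client):

kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=<K8s worker hostname>

kubectl drain <K8s worker hostname> --ignore-daemonsets --dry-run=client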

Upgrade fails because nodes are not drained

One of the following errors occurs when upgrading Kubernetes:

  • Unable to drain node "<K8s hostname>": "<K8s hostname>" cordoned error: unable to drain node "<K8s hostname>", aborting command.

  • There are pending nodes to be drained: <K8s hostname> error: cannot delete Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet (use --force to override): default/test-pvc-pod

HPE Ezmeral Runtime Enterprise does not force eviction of pods during the drain operation. It is likely the pod has a persistent volume (PV) attached, which is preventing pod eviction.

To complete the upgrade, manually remove the persistent volume claim (PVC) from that node.
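For example, assuming the default/test-pvc-pod pod shown in the error above is the pod holding the PVC, the following commands find the claim it uses, delete the pod, and then delete the claim. The PVC name is a placeholder; before deleting the claim, verify that the data in the associated persistent volume is no longer needed:

kubectl -n default get pod test-pvc-pod -o jsonpath='{.spec.volumes[*].persistentVolumeClaim.claimName}'

kubectl -n default delete pod test-pvc-pod

kubectl -n default delete pvc <PVC name>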

NOTE

When running kubectl, enable the -v (verbosity) option to produce detailed output. For example:

kubectl -v=10 config current-context