Kubernetes Node Issues

This article contains troubleshooting steps related to Kubernetes nodes.

Each entry below lists a symptom, followed by the logs to collect and the diagnostic steps.
Symptom: Unable to connect to the server: EOF

Example:
# kubectl get nodes
Unable to connect to the server: EOF

Logs to collect: On the Kubernetes master, journalctl (or /var/log/messages)

Diagnostic Steps:

  • Check whether the local Kubernetes API server is responding.
  • Try running the kubectl command from a different client.
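
The diagnostic steps above can be sketched as a small bash helper. This is a minimal sketch, not a vendor-supplied script: it assumes kubectl is on the PATH of the master node, that the API server exposes the standard /healthz endpoint, and that the kubelet runs as a systemd unit named kubelet. The KUBECTL variable and the function names are hypothetical conveniences introduced here.

```shell
#!/usr/bin/env bash
# Sketch: check whether the local Kubernetes API server responds.
# KUBECTL is a hypothetical override so a wrapper or alternate binary can be used.
KUBECTL="${KUBECTL:-kubectl}"

# Returns 0 when the API server answers "ok" on /healthz, non-zero otherwise.
api_server_healthy() {
    local reply
    reply="$("$KUBECTL" get --raw='/healthz' --request-timeout=5s 2>&1)"
    [ "$reply" = "ok" ]
}

# Prints recent kubelet log lines (assumes a systemd unit named "kubelet").
recent_kubelet_logs() {
    journalctl -u kubelet --since "30 min ago" --no-pager | tail -n 50
}

# Reports the result and, on failure, points at the logs to collect.
report_api_server() {
    if api_server_healthy; then
        echo "API server is responding"
    else
        echo "API server is NOT responding; collect journalctl or /var/log/messages"
        return 1
    fi
}
```

Running report_api_server on the master covers the first step. For the second step, the same kubectl get nodes command can be repeated from another machine that has a copy of the cluster's kubeconfig; if it succeeds there, the problem is more likely on the original client than on the API server.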
Symptom: Kubernetes node failed to fetch join command

Example, from /var/log/bluedata/bds-mgmt.log on the Controller:
Feb 5 09:53:50 dl380-002 BDS: MGMT :[ error][ src/k8s/bd_mgmt_api_k8s.erl:01122] <0.32028.17> exception reason: {k8s_cluster_creation,["6","failed to fetch join command"]}

It is very likely that the controller and the worker do not have network connectivity. Possible root causes:
  • Firewall
  • Misconfigured proxy setting
  • Cloud (AWS) blocking traffic
  • Router issue
Logs to collect: On the Controller, /var/log/bluedata/bds-mgmt.log
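
The connectivity and proxy causes listed above can be narrowed down with a few quick checks from the Controller. The following is a minimal bash sketch under stated assumptions: the function names are hypothetical, the host and port passed to check_port are whatever the worker actually listens on in your deployment (the source does not name them), and in_no_proxy only handles exact host matches, not wildcard or CIDR entries.

```shell
#!/usr/bin/env bash
# Sketch: look for the join failure in the log and test basic reachability.

# Prints matching log lines with line numbers; the default path is the one from this article.
find_join_errors() {
    local logfile="${1:-/var/log/bluedata/bds-mgmt.log}"
    grep -n 'failed to fetch join command' "$logfile"
}

# Tries to open a TCP connection using bash's /dev/tcp feature, with a 3 second limit.
check_port() {
    local host="$1" port="$2"
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "OK      ${host}:${port} reachable"
    else
        echo "FAILED  ${host}:${port} unreachable"
        return 1
    fi
}

# Returns 0 if the host is listed verbatim in no_proxy (or NO_PROXY).
in_no_proxy() {
    local host="$1" list="${no_proxy:-${NO_PROXY:-}}" entry
    local -a entries
    IFS=',' read -ra entries <<< "$list"
    for entry in "${entries[@]}"; do
        entry="${entry//[[:space:]]/}"   # drop stray spaces around entries
        [ "$entry" = "$host" ] && return 0
    done
    return 1
}
```

For example, with a hypothetical worker address of 10.0.0.5, check_port 10.0.0.5 22 tests whether SSH is reachable, and in_no_proxy 10.0.0.5 shows whether traffic to that host would bypass a configured proxy. A failed port check points toward the firewall, cloud, or router causes; a host missing from no_proxy while http_proxy or https_proxy is set points toward the proxy setting.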
Symptom: Failed to execute on building Kubernetes operator

Example, from /var/log/bluedata/bds-mgmt.log on the Controller:
Failed to exec: kubectl -n hpecp create -f /opt/bluedata/bundles/bluedata-epic-entdoc-minimal-debug-5.0-3002/scripts/iucomponents/k8s_cluster/operator-templates/config-crs/cr-hpecp-config.yaml
ERROR: Failed executing 06_operators.sh
SKIPPING rollback

Collect the Kubernetes events. On the Kubernetes master node:
kubectl get events
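
Because the failing command in this example targets the hpecp namespace, it can help to save the events for that namespace to a file, sorted by time. The following bash sketch is one way to do that; the function name, the output file naming, and the KUBECTL override are hypothetical, and the choice of namespace is an assumption drawn from the log excerpt rather than a documented procedure.

```shell
#!/usr/bin/env bash
# Sketch: save the events of one namespace to a file for later analysis.
# KUBECTL is a hypothetical override so a wrapper or alternate binary can be used.
KUBECTL="${KUBECTL:-kubectl}"

# Usage: collect_events <namespace> [output-dir]
# Writes events sorted by creation time and prints the path of the file written.
collect_events() {
    local ns="$1" outdir="${2:-.}"
    local outfile="${outdir}/events-${ns}.txt"
    mkdir -p "$outdir" || return 1
    "$KUBECTL" -n "$ns" get events --sort-by=.metadata.creationTimestamp > "$outfile" 2>&1
    local rc=$?          # exit status of kubectl, captured before anything else runs
    echo "$outfile"
    return "$rc"
}
```

On the master node, collect_events hpecp /tmp would write /tmp/events-hpecp.txt. Running kubectl get events --all-namespaces in addition captures events from other namespaces in case the failure originates elsewhere.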