Notebooks
Describes how to identify and debug issues for Notebooks.
The Default User Jupyter Notebook Cannot Connect to Kubeflow
Error message: Couldn't find any information for the status of this notebook
This occurs when a username starts with a number, such as 3user, because notebooks cannot have names that start with a number.
When a user is added to HPE Ezmeral Unified Analytics Software, the system automatically creates a default notebook for the user and assigns the notebook a name in the following format:
<username>-notebook
If the username starts with a number, such as 3user, the default user notebook name also starts with a number (3user-notebook), which is not supported. When this occurs, Kubeflow does not recognize the notebook because of its name and cannot connect.
Workaround
- Option 1: Create a new notebook with the same image and configurations. Make sure that the notebook name consists of lowercase alphanumeric characters, with or without dashes (-), and starts with a letter (a-z), not a number. For example, you can name a notebook my-notebook-1, but you cannot name a notebook 1-my-notebook.
- Option 2: Ask your HPE Ezmeral Unified Analytics Software admin to delete the user account and then create a new one with a username that adheres to the Username Attribute naming requirements, as described in AD/LDAP Servers.
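The naming rule in Option 1 can be checked before you create a notebook. The following is a minimal sketch in POSIX shell. The function name is_valid_notebook_name is hypothetical, and the pattern encodes only the rule stated above (lowercase alphanumeric characters and dashes, starting with a letter); the platform may enforce additional limits that are not covered here.

```shell
# Hypothetical helper, not part of the product: succeeds (exit status 0) only
# if the name starts with a letter a-z and otherwise contains only a-z, 0-9,
# or dashes.
is_valid_notebook_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9-]*$'
}

# Check the two example names from the text above.
for name in my-notebook-1 1-my-notebook; do
  if is_valid_notebook_name "$name"; then
    echo "$name: valid"
  else
    echo "$name: invalid"
  fi
done
```

The same check applied to 3user-notebook fails, which matches the behavior described for usernames that start with a number.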
“No healthy upstream” Error in Notebook Server Connection
When connecting to the notebook server, you may get the "no healthy upstream" error message due to an unhealthy notebook pod. To identify the issue, check the pod logs and events, either in the Kubeflow UI or manually with kubectl commands.
- Using Kubeflow UI
To access pod logs and events and to check the container status from the Kubeflow UI, follow these steps:
- Sign in to HPE Ezmeral Unified Analytics Software.
- Click the Tools & Frameworks icon on the left navigation bar.
- Navigate to the Kubeflow tile under the Data Science tab and click Open.
- In the Kubeflow Central Dashboard UI, click Notebooks on the left navigation bar.
- Click <your-unhealthy-notebook-name> to view the notebook details.
- To check the current status of the container, click the OVERVIEW tab and look for the Conditions section. The Conditions section shows the current status of the container.
- To access pod logs, click the LOGS tab.
- To access pod events, click the EVENTS tab.
- Using kubectl Commands
To access pod logs and events and to check the container status from the command line, follow these steps:
- To get pod events and container statuses, run:
kubectl describe pod -n <user-ns> <notebook-name>-0
Output:
Name:         temp-0
Namespace:    hpedemo-user01
.........
  temp:
    Container ID:
    Image:          gcr.io/mapr-252711/kubeflow/notebooks/jupyter-tensorflow-full:ezaf-v1.8.0
    Image ID:
    Port:           8888/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
.......
Events:
  Type     Reason                  Age   From                     Message
  ----     ------                  ----  ----                     -------
  Warning  FailedScheduling        48s   default-scheduler        0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling..
  Warning  FailedScheduling        46s   default-scheduler        0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling..
  Normal   Scheduled               44s   default-scheduler        Successfully assigned hpedemo-user01/temp-0 to mip-bd-dev04.mip.storage.hpecorp.net
  Normal   SuccessfulAttachVolume  44s   attachdetach-controller  AttachVolume.Attach succeeded for volume "mapr-pv-bd0db07c-4e43-4e78-8503-7f61649a7bd0"
  Normal   Pulling                 35s   kubelet                  Pulling image "marketplace.us1.greenlake-hpe.com/ezua/istio/proxyv2:1.16.2"
  Normal   Pulled                  34s   kubelet                  Successfully pulled image "marketplace.us1.greenlake-hpe.com/ezua/istio/proxyv2:1.16.2" in 1.127945155s (1.127954107s including waiting)
  Normal   Created                 34s   kubelet                  Created container istio-validation
  Normal   Started                 34s   kubelet                  Started container istio-validation
  Normal   Pulling                 33s   kubelet                  Pulling image "marketplace.us1.greenlake-hpe.com/ezua/istio/proxyv2:1.16.2"
  Normal   Pulled                  29s   kubelet                  Successfully pulled image "marketplace.us1.greenlake-hpe.com/ezua/istio/proxyv2:1.16.2" in 4.611252056s (4.611259156s including waiting)
  Normal   Created                 29s   kubelet                  Created container istio-proxy
  Normal   Started                 28s   kubelet                  Started container istio-proxy
  Normal   Pulling                 27s   kubelet                  Pulling image "gcr.io/mapr-252711/kubeflow/notebooks/jupyter-tensorflow-full:ezaf-v1.8.0"
- To get pod logs, run:
kubectl logs -n <user-ns> <notebook-name>-0
Result:
You can now identify the issue by checking pod logs, events, and the current status of the container.
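The kubectl steps above can also be narrowed to just the fields of interest. The following is a sketch only: the function names are hypothetical wrappers, and the pod name follows the <notebook-name>-0 convention used in the commands above. It relies on standard kubectl options (the jsonpath output format, --field-selector, --sort-by, and --all-containers).

```shell
# Hypothetical wrappers around kubectl; pass <user-ns> and <notebook-name>.

# Print one line per container: name, ready flag, and restart count.
notebook_container_states() {
  kubectl get pod -n "$1" "$2-0" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.ready}{"\t"}{.restartCount}{"\n"}{end}'
}

# List only the events that involve the notebook pod, oldest first.
notebook_events() {
  kubectl get events -n "$1" \
    --field-selector "involvedObject.name=$2-0" \
    --sort-by=.lastTimestamp
}

# Print logs from every container in the pod, including sidecars.
notebook_logs() {
  kubectl logs -n "$1" "$2-0" --all-containers=true
}
```

For example, notebook_events <user-ns> <notebook-name> shows scheduling and image pull events such as the FailedScheduling warnings in the sample output, without the rest of the describe output.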
Memory Accumulation and Unreleased Memory in Jupyter Notebooks
Memory consumption keeps increasing as Jupyter notebooks are run. Even after you close a notebook, its memory is not released, so objects gradually accumulate in memory with each notebook run. Eventually, the notebook server becomes unusable as memory reaches its limit, and you must launch a new notebook server.
To release the memory, follow these steps to kill the kernels of closed notebooks:
- Sign in to HPE Ezmeral Unified Analytics Software.
- Click the Notebooks icon on the left navigation bar of the HPE Ezmeral Unified Analytics Software screen.
- Connect to the notebook server.
- Open the notebook you want to close.
- Click File in the menu bar.
- Select Close and Shutdown Notebook.
- Repeat the process for any other notebooks that are no longer in use.
Result:
Closing notebooks with the Close and Shutdown Notebook option ensures that the associated kernel is properly shut down, which releases the memory it was using. This prevents the accumulation of objects in memory and keeps the notebook server usable for longer periods.
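To confirm that memory is released after kernels are shut down, you can compare the notebook pod's memory usage before and after the steps above. This is a minimal sketch that assumes you have kubectl access to the user namespace and that the cluster exposes the metrics API (for example, through Metrics Server); neither assumption is stated in this document. The function name is hypothetical, and the pod name follows the <notebook-name>-0 convention used elsewhere on this page.

```shell
# Hypothetical helper: show per-container CPU and memory usage for a notebook
# pod. Requires the metrics API; kubectl top returns an error without it.
notebook_memory_usage() {
  kubectl top pod -n "$1" "$2-0" --containers
}
```

Run notebook_memory_usage <user-ns> <notebook-name> once before and once after shutting down unused notebooks and compare the reported memory values.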
Specified Image Pull Policy Not Applied to a Pod
When you create a notebook server and set the imagePullPolicy to IfNotPresent or Never, the specified image pull policy is not applied to the pod. In both scenarios, the imagePullPolicy is set to Always.
To observe this behavior, follow these steps:
- Sign in to HPE Ezmeral Unified Analytics Software.
- Click the Notebooks icon on the left navigation bar of the HPE Ezmeral Unified Analytics Software screen.
- Click New Notebook Server. You will be navigated to the Kubeflow Notebooks UI.
- Enter the name of the notebook server.
- Click Custom Notebook.
- Click Advanced Options.
- Set Image pull policy to IfNotPresent.
- To launch the notebook server, click Launch.
- After creating the notebook server, click <your-notebook-name> to view the notebook details.
- Click the YAML tab.
- Select Show the full YAML of the Pod.
- Locate the imagePullPolicy property for the image used in creating the notebook.
Result:
The imagePullPolicy is set to Always.
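The same check can be made from the command line instead of the YAML tab. The following sketch assumes kubectl access to the user namespace and the <notebook-name>-0 pod naming used in the kubectl commands earlier on this page. The function name is hypothetical; the jsonpath output format is a standard kubectl option.

```shell
# Hypothetical helper: print one line per container in the notebook pod with
# its name, image, and the imagePullPolicy that was actually applied.
notebook_pull_policies() {
  kubectl get pod -n "$1" "$2-0" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.image}{"\t"}{.imagePullPolicy}{"\n"}{end}'
}
```

Run notebook_pull_policies <user-ns> <notebook-name> and look at the line for the notebook image; with the issue described above, the last column reads Always even when IfNotPresent or Never was selected.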