Dashboard - Kubernetes Administrator

Platform Administrator users who have access to the Site Admin tenant can access the Kubernetes Administrator Dashboard screen by selecting Dashboard in the main menu. The Kubernetes Administrator Dashboard screen presents a high-level overview of current Kubernetes activity. (See Dashboard - Platform Administrator for information about the dashboard for EPIC Big Data tenants and AI/ML projects.)

The top of this screen contains the Refresh Data function, which displays the date and time of the most recent Dashboard refresh. Clicking the Refresh Data button refreshes the data on this screen.

The following tabs are available:

Usage: This tab displays usage information on a per-tenant basis. See Usage Tab.
Load: This tab displays load statistics for on-premises CPU, memory, and network resources within the deployment. See Load Tab.
Services: This tab displays the health status of each component service on each host in the deployment. See Services Tab.
Alerts: This tab displays any alert messages generated by the system. See Alerts Tab.

Usage Tab

The Usage tab displays usage statistics for the Kubernetes clusters and tenants.

The top of the Usage tab displays dials showing the following aggregate information for all of the tenants in the deployment:

Cores Used: Percentage of available virtual CPU cores being used by all of the tenants in the deployment.
Memory Used (GB): Percentage of available RAM being used by all of the tenants in the deployment.
Ephemeral Storage Used (GB): Percentage of available ephemeral storage used and the total available ephemeral storage, in GB.
Persistent Storage Used (GB): Percentage of available persistent storage used and the total available persistent storage, in GB.
Tenant Storage Used (GB): Percentage of available tenant storage used and the total available tenant storage, in GB.
GPU Devices Used: Percentage of available GPU devices being used by all of the tenants in the deployment.

The bottom of this tab contains a table that lists all of the Kubernetes tenants in the deployment. This table displays the Tenant Name, Namespace, Cluster Name, Cores, Memory (GB), Ephemeral Storage (GB), Persistent Storage (GB), Tenant Storage (GB), and the number of Running Pods being used by that tenant. This number is expressed as x of y, where x is the allotted number and y is either the Tenant Quota or total System Resources, depending on your Show Usage against menu selection.
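For scripts that post-process values copied from this table, the x of y notation can be converted into a percentage. The following is a minimal sketch, assuming a cell reads literally as two numbers separated by the word "of"; parse_usage is a hypothetical helper, not part of the product.

```python
import re

# Hypothetical helper: assumes a table cell reads like "4 of 16" or "12.5 of 64".
_USAGE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+of\s+(\d+(?:\.\d+)?)\s*$")


def parse_usage(cell: str) -> tuple[float, float, float]:
    """Return (x, y, percent) for a cell written as 'x of y'."""
    match = _USAGE_RE.match(cell)
    if match is None:
        raise ValueError(f"not in 'x of y' form: {cell!r}")
    x, y = float(match.group(1)), float(match.group(2))
    # Guard against a zero quota so the caller never sees a division error.
    percent = 100.0 * x / y if y else 0.0
    return x, y, percent
```

Whether y represents the Tenant Quota or the total System Resources still depends on the Show Usage against selection, so the resulting percentage should be labeled accordingly.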

NOTE

For information about how to download detailed usage and uptime information in comma-delimited (.csv) format, see Downloading Kubernetes Usage Details.

Load Tab

The Load tab displays a series of dials and charts. Hovering the mouse over a bar opens a popup with more detailed information for the selected time.

This tab shows the following information for the selected time period:

Host CPU Utilization Percent: Percentage of host CPU utilization across all user space processes that are currently running for the selected host(s) over the selected time period. On multi-core systems, the percentages can be greater than 100%.
Host Memory Usage: Current use of host memory across all cluster processes for the selected host(s) over the selected time period.
Host Swap Memory Usage: Amount of swap file usage for the selected host(s) over the selected time period, in GB.
Host System Load: One-minute average system load percentage for the selected host(s) over the selected time period.

Host Network Traffic (Bytes In): Amount of incoming host network bandwidth being used by the selected host(s) over the selected time period.
Host Network Traffic (Bytes Out): Amount of outgoing host network bandwidth being used by the selected host(s) over the selected time period.
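Because Host CPU Utilization Percent can exceed 100% on multi-core systems, it may help to scale the value by the number of cores when comparing hosts of different sizes. The following sketch assumes the reported value is a sum of per-core percentages, which this page does not state explicitly; normalize_cpu_percent is a hypothetical helper.

```python
def normalize_cpu_percent(summed_percent: float, core_count: int) -> float:
    """Scale a CPU percentage summed across cores to a 0-100 range.

    Assumption: summed_percent is the sum of per-core utilization,
    so a fully busy 4-core host would report 400.
    """
    if core_count <= 0:
        raise ValueError("core_count must be positive")
    return summed_percent / core_count
```

For example, a reading of 250 on a 4-core host corresponds to 62.5% of that host's total CPU capacity under this assumption.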

The following additional information applies to tenants with GPUs enabled:

GPU Utilization (percent): Selecting All hosts in the left pull-down menu displays aggregate GPU utilization in percent per host. Selecting an individual host displays per-GPU utilization for that host.
GPU Memory Usage: Selecting All hosts in the left pull-down menu displays aggregate GPU memory usage in percent per host. Selecting an individual host displays per-GPU memory usage for that host.

You may select the host(s) you want to view and also adjust the time period for which results appear using the pull-down menus at the right side of the Load tab. The available options are:

Last Hour (default)
6 Hours
Day
Week
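When correlating the Load tab with external logs, the time period options can be translated into a start timestamp. This is only a sketch: it assumes that Day means the last 24 hours and Week the last 7 days, and both PERIODS and window_start are hypothetical names.

```python
from datetime import datetime, timedelta

# Assumed durations for the menu options listed above.
PERIODS = {
    "Last Hour": timedelta(hours=1),
    "6 Hours": timedelta(hours=6),
    "Day": timedelta(days=1),
    "Week": timedelta(weeks=1),
}


def window_start(period: str, now: datetime) -> datetime:
    """Return the earliest timestamp covered by the selected period."""
    return now - PERIODS[period]
```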

Services Tab

The Services tab displays the status of the services on each host being used for Kubernetes tenants.


This tab displays information such as (but not necessarily limited to) the following for each host in the deployment:

Host Name: Name of the host.
BD Agent: Status of the management service, which handles back-end administration tasks.
Monitoring Collector: Status of the monitoring engine that collects performance, usage, and other metrics.
Disk Pressure: Whether the available disk space and inodes on either the node's root filesystem or image filesystem have satisfied an eviction threshold.
Containerd Daemon: Status of the containerd daemon, which creates and manages containers.
Kube API Server: Status of the Kubernetes API server.
Kube Controller: Status of the Kubernetes controller manager.
Kube Proxy: Status of the Kubernetes proxy.
Kube Scheduler: Status of the control plane Kubernetes scheduler.
Kubelet: Status of the kubelet, which maintains the pods that are running on each host.
Memory Pressure: Whether the available host memory has satisfied an eviction threshold.
Network: Kubernetes network status.
FileServer: File server status of the integrated persistent storage.
MountPoint: Mount point status of the integrated persistent storage.
PosixClient: Status of the POSIX Client of the integrated persistent storage.
Warden: Warden status.
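The Disk Pressure and Memory Pressure entries correspond to standard Kubernetes node conditions, which can also be inspected outside the dashboard. The following sketch reads the JSON produced by kubectl get nodes -o json and reports both conditions per node. It is an illustration only; how the dashboard itself derives these values is not described here, and pressure_conditions is a hypothetical helper.

```python
import json


def pressure_conditions(node_list_json: str) -> dict:
    """Map node name -> {'DiskPressure': bool, 'MemoryPressure': bool}."""
    result = {}
    for node in json.loads(node_list_json).get("items", []):
        name = node["metadata"]["name"]
        flags = {"DiskPressure": False, "MemoryPressure": False}
        for cond in node.get("status", {}).get("conditions", []):
            if cond.get("type") in flags:
                # Kubernetes reports condition status as the strings
                # "True", "False", or "Unknown"; only "True" counts here.
                flags[cond["type"]] = cond.get("status") == "True"
        result[name] = flags
    return result
```

The function could be fed the saved output of the kubectl command above. A condition reported as Unknown is treated as not under pressure in this sketch, which may be too lenient for alerting purposes.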

The status of a service can be one of OK (green dot), CRITICAL (red dot), or DISABLED (intentionally not running; gray dot). Hovering the mouse over the status button opens a popup with additional information. In general:

The Master host must not display any red dots. If the Master host has one or more errors, then the Kubernetes cluster may not function properly.
If all of the dots for a Worker host are red, then that host will not be able to provide resources to the cluster. This situation typically occurs because the host has been powered off, the host has lost network connectivity, or HPE Ezmeral Runtime Enterprise is not properly installed.
A Worker host with some red and some green dots may cause some Kubernetes cluster operations to fail, unless the errors are transient conditions caused by the host powering on or regaining network connectivity.
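The guidelines above can be expressed as a small classification routine. This is a sketch under stated assumptions: the role and status strings, the return labels, and the decision to ignore DISABLED services are all illustrative choices, and classify_host is a hypothetical helper rather than product behavior.

```python
def classify_host(role: str, statuses: list[str]) -> str:
    """Classify a host from its service statuses.

    role: "master" or "worker" (assumed labels).
    statuses: values such as "OK", "CRITICAL", or "DISABLED".
    """
    # Assumption: DISABLED (gray) services are intentionally off and ignored.
    active = [s for s in statuses if s != "DISABLED"]
    critical = [s for s in active if s == "CRITICAL"]
    if not critical:
        return "healthy"
    if role == "master":
        # Any red dot on the Master host puts the cluster at risk.
        return "cluster-at-risk"
    if len(critical) == len(active):
        # All dots red: the Worker cannot provide resources.
        return "unavailable"
    # Mixed red and green: some cluster operations may fail.
    return "degraded"
```

A degraded result may be transient if the host is still powering on or regaining network connectivity, so a caller would likely want to re-check before acting on it.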

Please generate a support bundle and then contact HPE Technical Support if a host that is reporting service errors meets all of the following criteria:

HPE Ezmeral Runtime Enterprise is completely installed.
The host is powered on.
The host has network connectivity.

See The Support/Troubleshooting Screen and Generating a Support Bundle.

Alerts Tab

The Alerts tab displays any alert messages from the Caching Node, Data Server, and Management services.

The following alerts appear in this tab:

Notifications: Routine messages. A green dot appears next to each routine notification.
Error: A minor error has occurred. A gray dot appears next to each error notification.
Warning: A serious error has occurred. An orange dot appears next to each warning notification.
Critical: A critical error has occurred. A red dot appears next to each critical notification.
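For tooling that summarizes alerts, the four levels and their dot colors can be modeled as an ordered enumeration. The following is a sketch only: the numeric ordering follows the severity wording in the list above (routine, minor, serious, critical), and AlertLevel, DOT_COLOR, and most_severe are hypothetical names.

```python
from enum import IntEnum


class AlertLevel(IntEnum):
    # Ordered by the severity described in the list above.
    NOTIFICATION = 0  # routine message
    ERROR = 1         # minor error
    WARNING = 2       # serious error
    CRITICAL = 3      # critical error


DOT_COLOR = {
    AlertLevel.NOTIFICATION: "green",
    AlertLevel.ERROR: "gray",
    AlertLevel.WARNING: "orange",
    AlertLevel.CRITICAL: "red",
}


def most_severe(levels):
    """Return the highest alert level present, or None if there are none."""
    return max(levels, default=None)
```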

NOTE

The presence of non-routine alerts does not mean that HPE Ezmeral Runtime Enterprise will not function normally.

HPE Ezmeral Runtime Enterprise 5.6 Documentation
Abstract	HPE Ezmeral Container Platform is a unified container platform built on open source Kubernetes and designed for both cloud-native applications and non-cloud-native applications running on any infrastructure either on-premises, in multiple public clouds, in a hybrid model, or at the edge.
Published	July 2024
Edition	5.6.0