Configuring GPU Idle Reclaim
Describes how to configure GPU idle reclaim, view pod details, and view GPU usage.
You can view frameworks, the number of vGPUs assigned, framework status, priority level, and the idle time threshold on the GPU Control Panel screen. You can also view the pod details and the GPU utilization chart.
- Sign in to HPE Ezmeral Unified Analytics Software as Administrator.
- In the left navigation bar, click Administration → Resource Management.
On this screen, you can configure the policy settings and view the pod details and GPU usage as follows:
Configuring the Policy Settings
- Priority Level
  Set the priority level in the range 8000 to 10000, where 8000 is the lowest priority and 10000 is the highest priority. For example, a pod with priority level 8000 has a lower priority than a pod with priority level 10000.
  - Default priority level: 8000
  WARNING: Do not modify priority settings when pods are in a pending state, as this causes pod failure. Pods must be in either a running or terminated state when you modify the priority settings.
- Idle Time Threshold
  Set the maximum amount of time that a vGPU assigned to a workload can be idle before that workload can be preempted (deallocated) automatically by a pending workload.
  - Minimum idle time threshold: 60 seconds
  - Default idle time threshold: 300 seconds
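As a rough illustration of how the idle time threshold behaves, the following Python sketch checks whether a workload has been idle long enough to become a candidate for preemption. The function name `is_preemption_candidate` is hypothetical, and the inclusive comparison at the threshold is an assumption; this is not the product's API or its actual scheduling logic.

```python
DEFAULT_IDLE_THRESHOLD_SECONDS = 300  # documented default
MIN_IDLE_THRESHOLD_SECONDS = 60       # documented minimum


def is_preemption_candidate(
    idle_seconds: float,
    threshold_seconds: int = DEFAULT_IDLE_THRESHOLD_SECONDS,
) -> bool:
    """Hypothetical helper: True once vGPU idle time reaches the threshold.

    Assumption: the boundary is inclusive; the product may treat it differently.
    """
    if threshold_seconds < MIN_IDLE_THRESHOLD_SECONDS:
        raise ValueError("idle time threshold must be at least 60 seconds")
    return idle_seconds >= threshold_seconds
```

With the default threshold, a workload whose vGPU has been idle for 299 seconds is not yet a candidate, while one idle for 300 seconds is. Even then, it is deallocated only if a pending workload preempts it.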
The new policy settings are not applied to pods that are currently in the Running or Idle status. They are applied only to new workloads.
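The documented limits can be checked before you enter values on the screen. The sketch below is a minimal, hypothetical validator. The class `IdleReclaimPolicy`, the function `priority_change_is_safe`, and the lowercase state strings are illustrative names, not part of the product.

```python
from dataclasses import dataclass
from typing import Iterable

PRIORITY_MIN = 8000                # lowest documented priority
PRIORITY_MAX = 10000               # highest documented priority
IDLE_THRESHOLD_MIN_SECONDS = 60    # documented minimum


@dataclass(frozen=True)
class IdleReclaimPolicy:
    """Hypothetical container for the two policy settings, with documented defaults."""

    priority_level: int = 8000
    idle_time_threshold_seconds: int = 300

    def __post_init__(self) -> None:
        # Reject values outside the documented ranges.
        if not PRIORITY_MIN <= self.priority_level <= PRIORITY_MAX:
            raise ValueError("priority level must be between 8000 and 10000")
        if self.idle_time_threshold_seconds < IDLE_THRESHOLD_MIN_SECONDS:
            raise ValueError("idle time threshold must be at least 60 seconds")


def priority_change_is_safe(pod_states: Iterable[str]) -> bool:
    """True when every pod is running or terminated (no pod is pending).

    Assumption: states are passed as plain strings such as "running".
    """
    return all(s.lower() in {"running", "terminated"} for s in pod_states)
```

For example, `IdleReclaimPolicy(priority_level=9000, idle_time_threshold_seconds=120)` is accepted, while a priority level of 7999 or a threshold of 30 seconds raises `ValueError`.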
Viewing the Pod Details
To view the pod details, click a framework that is in the Idle or Running status. The pod details screen opens, showing the list of pods, the vGPUs assigned, the pod status, the pod age, and the GPU utilization chart.