GPU Scheduling Workload Scenarios
Describes GPU scheduling workload scenarios and the notebook example for GPU idle reclaim.
In HPE Ezmeral Unified Analytics Software, you can encounter the following GPU scheduling workload scenarios during GPU idle reclaim.
GPU Idle Reclaim
In HPE Ezmeral Unified Analytics Software, consider two GPU workloads, denoted as
Workload1 and Workload2. Currently,
Workload1 is running and is in an idle state while
Workload2 is pending due to lack of available GPU resources. In this
scenario, if the idle duration of Workload1 exceeds an idle time threshold,
Workload1 is preempted in favor of Workload2. Following
the preemption, Workload1 goes into a pending state, while
Workload2 is allocated GPU resources and starts running.
Active GPU Usage
In HPE Ezmeral Unified Analytics Software, consider two GPU workloads, denoted as
Workload1 and Workload2. Currently,
Workload1 is running and is using GPU resources while
Workload2 is pending due to lack of available GPU resources. The custom
scheduler runs a cron job every 5-10 minutes to determine whether pods are eligible for reclaim
based on their GPU usage and the annotation values set in the priority class attached to the
pod.
If the GPU usage for Workload1 is greater than 0.0,
Workload1 cannot be preempted in favor of Workload2. In
this scenario, Workload1 will continue to run and utilize the GPU resources
without interruption.
If the GPU usage for Workload1 is equal to 0.0 and if the idle duration of
Workload1 exceeds an idle time threshold, Workload1 is
preempted in favor of Workload2. Following the preemption,
Workload1 goes into a pending state, while Workload2 is
allocated GPU resources and starts running.
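The two conditions above can be sketched as a single eligibility check. This is a minimal illustration only, not the scheduler's actual code: the Workload class, the can_reclaim function, and the 600-second threshold are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    gpu_usage: float      # observed GPU usage; 0.0 means the GPU is idle
    idle_seconds: float   # how long the usage has stayed at 0.0


def can_reclaim(workload: Workload, idle_threshold_seconds: float) -> bool:
    """Return True only when the workload is idle past the threshold."""
    if workload.gpu_usage > 0.0:
        # Active GPU usage: the workload keeps its GPU resources.
        return False
    return workload.idle_seconds > idle_threshold_seconds


# Hypothetical values: a 600-second idle time threshold.
busy = Workload("Workload1", gpu_usage=0.35, idle_seconds=0)
idle = Workload("Workload1", gpu_usage=0.0, idle_seconds=900)
print(can_reclaim(busy, 600))  # False
print(can_reclaim(idle, 600))  # True
```

In this sketch, a workload whose idle duration equals the threshold exactly is not reclaimed, because the text states that the duration must exceed the threshold.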
Priority Scheduling
In HPE Ezmeral Unified Analytics Software, consider three GPU workloads, denoted as
Workload1, Workload2, and Workload3.
Currently, Workload1 is running and is in an idle state,
Workload2 is pending due to lack of available GPU resources, and
Workload3 has the highest priority among the three workloads and is
pending due to lack of available GPU resources. In this scenario, if the idle duration of
Workload1 exceeds an idle time threshold, Workload1 is
preempted in favor of Workload3. Following the preemption,
Workload1 goes into a pending state, Workload3 is
allocated GPU resources and starts running, and Workload2 remains in
the pending state.
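The choice between the pending workloads can be sketched as picking the one with the highest priority. This is an illustration under assumptions: the PendingWorkload class, the pick_next function, and the numeric priority values are hypothetical, and the sketch assumes that a larger number means a higher priority.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PendingWorkload:
    name: str
    priority: int  # hypothetical numeric priority; larger means more important


def pick_next(pending: List[PendingWorkload]) -> Optional[PendingWorkload]:
    """Return the pending workload with the highest priority, or None."""
    if not pending:
        return None
    # max() returns the first item encountered when priorities tie.
    return max(pending, key=lambda w: w.priority)


queue = [PendingWorkload("Workload2", 10), PendingWorkload("Workload3", 100)]
chosen = pick_next(queue)
print(chosen.name)  # Workload3
```

Workload2 stays in the queue in this sketch and would be considered again the next time GPU resources are reclaimed or freed.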
Notebook Example for GPU Idle Reclaim
Consider a scenario in which HPE Ezmeral Unified Analytics Software is configured with a single physical GPU and you have chosen the small vGPU size, which provides 7 vGPUs. Each application is assigned a maximum of one vGPU.
The seven vGPUs are assigned to seven notebook servers: idle-gpu-notebook,
used-gpu-notebook-1, used-gpu-notebook-2,
used-gpu-notebook-3, used-gpu-notebook-4,
used-gpu-notebook-5, and used-gpu-notebook-6. In this
scenario, the idle-gpu-notebook notebook server has an idle GPU with no GPU
usage while the six other notebook servers are actively using GPU resources. The idle-gpu-notebook notebook server has an Idle status and the six others have a Running status.
Another notebook server, test-idle-notebook-2, then requests a vGPU and is pending because all seven vGPUs are in use. As the GPU usage for
idle-gpu-notebook is equal to 0.0, as soon as the idle duration of
idle-gpu-notebook exceeds an idle time threshold,
idle-gpu-notebook is preempted in favor of
test-idle-notebook-2. Following the preemption,
idle-gpu-notebook goes into a pending state, while
test-idle-notebook-2 is allocated GPU resources and starts running.
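The notebook example can be traced with a small simulation. This is a sketch under stated assumptions, not the product's implementation: the reclaim_once function, the 600-second threshold, and the usage and idle-time figures are hypothetical; only the notebook server names and the count of 7 vGPUs come from the example.

```python
VGPU_COUNT = 7
IDLE_THRESHOLD_SECONDS = 600  # hypothetical threshold

# name -> (gpu_usage, idle_seconds); the numbers are illustrative only.
running = {"idle-gpu-notebook": (0.0, 900)}
for i in range(1, 7):
    running[f"used-gpu-notebook-{i}"] = (0.8, 0)
pending = ["test-idle-notebook-2"]


def reclaim_once(running, pending, threshold):
    """Preempt at most one idle notebook in favor of the first pending one."""
    if not pending:
        return None
    victim = next(
        (name for name, (usage, idle_s) in running.items()
         if usage == 0.0 and idle_s > threshold),
        None,
    )
    if victim is None:
        return None  # every running notebook is actively using its vGPU
    newcomer = pending.pop(0)
    del running[victim]
    running[newcomer] = (0.0, 0)  # the newcomer takes over the freed vGPU
    pending.append(victim)        # the preempted notebook goes back to pending
    return victim, newcomer


result = reclaim_once(running, pending, IDLE_THRESHOLD_SECONDS)
print(result)   # ('idle-gpu-notebook', 'test-idle-notebook-2')
print(pending)  # ['idle-gpu-notebook']
```

After the call, seven notebook servers are still running, so all 7 vGPUs remain allocated, and idle-gpu-notebook waits in the pending list.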