GPU Scheduling Workload Scenarios
Describes GPU scheduling workload scenarios and the notebook example for GPU idle reclaim.
In HPE Ezmeral Unified Analytics Software, you can encounter the following GPU scheduling workload scenarios during GPU idle reclamation.
GPU Idle Reclaim
In HPE Ezmeral Unified Analytics Software, consider two GPU workloads, denoted as Workload1 and Workload2. Currently, Workload1 is running and is in an idle state while Workload2 is pending due to lack of available GPU resources. In this scenario, if the idle duration of Workload1 exceeds an idle time threshold, Workload1 is preempted in favor of Workload2. Following the preemption, Workload1 goes into a pending state, while Workload2 is allocated GPU resources and starts running.
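The state change in this scenario can be sketched in a few lines of Python. This is only an illustrative model, not the product's scheduler: the Workload class, the reclaim_idle_gpu function, and the 600-second threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    state: str                 # "Running" or "Pending"
    idle_seconds: float = 0.0  # how long the workload has been idle


def reclaim_idle_gpu(running, pending, idle_threshold_seconds):
    """Swap the two workloads if the running one has been idle too long.

    Returns True when a preemption happened, False otherwise.
    """
    if running.state != "Running" or pending.state != "Pending":
        return False
    if running.idle_seconds <= idle_threshold_seconds:
        return False  # not idle long enough; nothing changes
    running.state = "Pending"   # the idle workload gives up its GPU
    running.idle_seconds = 0.0
    pending.state = "Running"   # the waiting workload receives the GPU
    return True


# Hypothetical values: idle for 900 s against a 600 s threshold.
w1 = Workload("Workload1", "Running", idle_seconds=900)
w2 = Workload("Workload2", "Pending")
print(reclaim_idle_gpu(w1, w2, idle_threshold_seconds=600), w1.state, w2.state)
```

With the values shown, Workload1 ends up pending and Workload2 running. If the idle duration were below the threshold, both workloads would keep their states.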
Active GPU Usage
In HPE Ezmeral Unified Analytics Software, consider two GPU workloads, denoted as Workload1 and Workload2. Currently, Workload1 is running and is using GPU resources while Workload2 is pending due to lack of available GPU resources. The custom scheduler runs a cron job every 5-10 minutes to determine the eligibility of reclaiming pods based on their GPU usage and the annotation values set in the priority class attached to the pod.
If the GPU usage for Workload1 is greater than 0.0, Workload1 cannot be preempted in favor of Workload2. In this scenario, Workload1 will continue to run and utilize the GPU resources without interruption.
If the GPU usage for Workload1 is equal to 0.0 and if the idle duration of Workload1 exceeds an idle time threshold, Workload1 is preempted in favor of Workload2. Following the preemption, Workload1 goes into a pending state, while Workload2 is allocated GPU resources and starts running.
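The two conditions above amount to a simple eligibility test that a periodic check could apply to each running pod. The following Python sketch assumes that test only. The function name is_reclaimable and the sample numbers are hypothetical, and the sketch does not model the priority class annotations that the actual scheduler also reads.

```python
def is_reclaimable(gpu_usage, idle_seconds, idle_threshold_seconds):
    """Return True if a running workload may be preempted.

    gpu_usage: most recent GPU usage reading for the workload
    idle_seconds: how long the workload has shown no GPU usage
    idle_threshold_seconds: assumed configured idle time threshold
    """
    if gpu_usage > 0.0:
        return False  # actively using the GPU, so never preempted
    return idle_seconds > idle_threshold_seconds


# Hypothetical readings against a 600-second threshold.
print(is_reclaimable(0.35, 1200, 600))  # False: GPU is in use
print(is_reclaimable(0.0, 300, 600))    # False: idle, but not long enough
print(is_reclaimable(0.0, 1200, 600))   # True: idle past the threshold
```

Because the check runs on a schedule rather than continuously, a workload that crosses the threshold is preempted at the next check, not at the exact moment the threshold is reached.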
Priority Scheduling
In HPE Ezmeral Unified Analytics Software, consider three GPU workloads, denoted as Workload1, Workload2, and Workload3. Currently, Workload1 is running and is in an idle state, Workload2 is pending due to lack of available GPU resources, and Workload3 has the highest priority among the three workloads and is pending due to lack of available GPU resources. In this scenario, if the idle duration of Workload1 exceeds an idle time threshold, Workload1 is preempted in favor of Workload3. Following the preemption, Workload1 goes into a pending state, Workload3 is allocated GPU resources and starts running, and Workload2 will continue to be in the pending state.
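Choosing which pending workload receives the reclaimed GPU can be illustrated with a short Python sketch. The numeric priority values and the function name pick_next are assumptions made for the example. The sketch also assumes that a larger number means a higher priority, which matches the Kubernetes priority class convention.

```python
def pick_next(pending):
    """Return the name of the highest-priority pending workload.

    pending: list of (name, priority) tuples, where a larger number
             means a higher priority (assumption for this sketch).
    Returns None when nothing is pending.
    """
    if not pending:
        return None
    name, _priority = max(pending, key=lambda item: item[1])
    return name


# Hypothetical priorities: Workload3 outranks Workload2.
pending = [("Workload2", 10), ("Workload3", 100)]
winner = pick_next(pending)
still_pending = [name for name, _ in pending if name != winner]
print(winner, still_pending)
```

Here Workload3 is selected and Workload2 stays in the pending list, mirroring the outcome described above.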
Notebook Example for GPU Idle Reclaim
Consider a scenario in which HPE Ezmeral Unified Analytics Software is configured with a single physical GPU. In this scenario, you have chosen the small vGPU size, which provides 7 vGPUs. Each application is assigned at most one vGPU.
Seven notebook servers each hold one vGPU: idle-gpu-notebook, used-gpu-notebook-1, used-gpu-notebook-2, used-gpu-notebook-3, used-gpu-notebook-4, used-gpu-notebook-5, and used-gpu-notebook-6. In this scenario, the idle-gpu-notebook notebook server has an idle GPU with no GPU usage while the six other notebook servers are actively using GPU resources. The idle-gpu-notebook notebook server has an Idle status and the six others have a Running status.
Now consider an additional notebook server, test-idle-notebook-2, that requests a vGPU while all seven vGPUs are allocated. As the GPU usage for idle-gpu-notebook is equal to 0.0, as soon as the idle duration of idle-gpu-notebook exceeds an idle time threshold, idle-gpu-notebook is preempted in favor of test-idle-notebook-2. Following the preemption, idle-gpu-notebook goes into a pending state, while test-idle-notebook-2 is allocated GPU resources and starts running.
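The notebook example can be walked through with a small Python simulation of the seven-vGPU pool. This is a rough model under stated assumptions. The admit function, the usage figures for the busy notebooks, the idle durations, and the 600-second threshold are all invented for illustration, and only the notebook names and the vGPU count come from the scenario.

```python
VGPU_COUNT = 7  # small vGPU size on a single physical GPU, per the scenario


def admit(new_server, running, idle_seconds, idle_threshold_seconds):
    """Try to give new_server a vGPU.

    running: dict mapping server name -> current GPU usage
    idle_seconds: dict mapping server name -> seconds without GPU usage
    Returns (admitted, preempted_name).
    """
    if len(running) < VGPU_COUNT:
        running[new_server] = 0.0
        return True, None
    for name, usage in list(running.items()):
        idle_long_enough = idle_seconds.get(name, 0.0) > idle_threshold_seconds
        if usage == 0.0 and idle_long_enough:
            del running[name]           # the idle server goes back to pending
            running[new_server] = 0.0   # the new server takes its vGPU
            return True, name
    return False, None  # every vGPU is busy; new_server stays pending


# Illustrative usage values; only idle-gpu-notebook has no GPU usage.
running = {"idle-gpu-notebook": 0.0}
for i in range(1, 7):
    running[f"used-gpu-notebook-{i}"] = 0.5
idle_seconds = {"idle-gpu-notebook": 900}

print(admit("test-idle-notebook-2", running, idle_seconds, 600))
print(sorted(running))
```

In this run, idle-gpu-notebook is the only server that meets both conditions, so it is the one preempted, and the pool still holds exactly seven servers afterward.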