Enabling GPU Support on Kubeflow Kserve Model Serving
Describes how to enable GPU support on a Kubeflow Kserve model serving instance.
Prerequisites
- Sign in to HPE Ezmeral Unified Analytics Software.
- Train and save a model using the PyTorch CUDA or TensorFlow CUDA libraries (a minimal training sketch follows these prerequisites).
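
For reference, the following is a minimal sketch of training and saving a model with PyTorch on CUDA. The model architecture, data, and file name are illustrative placeholders, not part of this procedure:

import torch
import torch.nn as nn

# Train on the GPU when available; a CUDA build of PyTorch is required
# for GPU-backed serving to be meaningful.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy model and random data, for illustration only.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for _ in range(100):
    x = torch.randn(32, 4, device=device)
    y = torch.randint(0, 2, (32,), device=device)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Save the weights, then upload the artifact to your model store (for
# example, the MLflow S3 bucket referenced by storageUri later in this task).
torch.save(model.state_dict(), "model.pt")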
About this task
To enable GPU support for a Kubeflow Kserve model serving instance in HPE Ezmeral Unified Analytics Software, follow these steps:
Procedure
- Click the Tools & Frameworks icon on the left navigation bar. Navigate to the Kubeflow tile under the Data Science tab and click Open.
- Click Endpoints in the left-side menu bar of the Kubeflow Central Dashboard.
- Click the + New Endpoint button or click on your saved model.
- Create or update the InferenceService yaml manifest, setting storageUri and the corresponding predictor type (tensorflow or pytorch).
- To enable GPU, set the resources.limits section of the yaml as follows (a programmatic alternative to creating the manifest through the dashboard is sketched after this procedure). For example:

  apiVersion: "serving.kserve.io/v1beta1"
  kind: "InferenceService"
  metadata:
    name: "tensorflow-gpu"
    namespace: "<user-name>"
  spec:
    predictor:
      serviceAccountName: <service-account-name>
      tensorflow:
        storageUri: "s3://mlflow/4/4d60878e34a947b080a6015ae297aaca/artifacts"
        resources:
          limits:
            nvidia.com/gpu: 1
NOTE: With MIG configuration, only one GPU can be assigned per application. For details, see GPU Support.
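
If you manage manifests outside the dashboard, the InferenceService can also be created programmatically. The following is a minimal sketch using the Kubernetes Python client; the file name inference-service.yaml and the use of a local kubeconfig are assumptions for illustration:

import yaml
from kubernetes import client, config

# Load credentials from the local kubeconfig (an assumption; inside a
# cluster pod, use config.load_incluster_config() instead).
config.load_kube_config()

# Read the InferenceService manifest shown in the step above.
with open("inference-service.yaml") as f:
    manifest = yaml.safe_load(f)

# InferenceService is a custom resource, so it is created through the
# CustomObjectsApi rather than a typed client.
api = client.CustomObjectsApi()
api.create_namespaced_custom_object(
    group="serving.kserve.io",
    version="v1beta1",
    namespace=manifest["metadata"]["namespace"],
    plural="inferenceservices",
    body=manifest,
)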
Results
The GPU is now enabled on the Kubeflow Kserve model serving instance.
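
To confirm that the endpoint serves predictions, you can send a request using the KServe V1 inference protocol (POST /v1/models/<model-name>:predict). The following is a minimal sketch; the endpoint URL, token, and input shape are hypothetical placeholders that depend on your deployment:

import requests

# Hypothetical values; copy the real URL and token from the endpoint's
# details page in the Kubeflow Central Dashboard.
ENDPOINT = "https://tensorflow-gpu.<user-name>.example.com"
MODEL_NAME = "tensorflow-gpu"
TOKEN = "<session-token>"

# KServe V1 protocol: the request body carries a list of input instances.
payload = {"instances": [[1.0, 2.0, 3.0, 4.0]]}
response = requests.post(
    f"{ENDPOINT}/v1/models/{MODEL_NAME}:predict",
    json=payload,
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
response.raise_for_status()
print(response.json())  # e.g. {"predictions": [...]}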