Enabling GPU Support on Kubeflow KServe Model Serving
Describes how to enable GPU support on a Kubeflow KServe model serving instance.
Prerequisites
- Sign in to HPE Ezmeral Unified Analytics Software.
- Train and save a model using the PyTorch CUDA or TensorFlow CUDA libraries.
About this task
To enable GPU support for a Kubeflow KServe model serving instance in HPE Ezmeral Unified Analytics Software, follow these steps:
Procedure
- Click the Tools & Frameworks icon on the left navigation bar. Navigate to the Kubeflow tile under the Data Science tab and click Open.
- Click Endpoints on the left-side menu bar of the Kubeflow Central Dashboard.
- Click the + New Endpoint button, or click your saved model to update it.
- Create or update the InferenceService yaml manifest, setting storageUri and the corresponding type of predictor (tensorflow or pytorch); see the PyTorch sketch after this procedure.
- To enable GPU, set the resources.limits section of the yaml as follows.

  For example:

  apiVersion: "serving.kserve.io/v1beta1"
  kind: "InferenceService"
  metadata:
    name: "tensorflow-gpu"
    namespace: "<user-name>"
  spec:
    predictor:
      serviceAccountName: <service-account-name>
      tensorflow:
        storageUri: "s3://mlflow/4/4d60878e34a947b080a6015ae297aaca/artifacts"
        resources:
          limits:
            nvidia.com/gpu: 1

  NOTE: With MIG configuration, only one GPU can be assigned per application. For details, see GPU Support.
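If the model was saved with PyTorch, use the pytorch predictor instead of tensorflow. The following is a minimal sketch, assuming a hypothetical endpoint name pytorch-gpu and a placeholder storageUri; substitute your own namespace, service account, and MLflow artifact path:

  apiVersion: "serving.kserve.io/v1beta1"
  kind: "InferenceService"
  metadata:
    name: "pytorch-gpu"                # hypothetical name for illustration
    namespace: "<user-name>"
  spec:
    predictor:
      serviceAccountName: <service-account-name>
      pytorch:
        storageUri: "s3://mlflow/<experiment-id>/<run-id>/artifacts"   # placeholder path
        resources:
          limits:
            nvidia.com/gpu: 1          # one GPU; with MIG, at most one per application

As in the TensorFlow example, the nvidia.com/gpu entry under resources.limits is what schedules the predictor pod onto a GPU node.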
Results
The GPU is now enabled on the Kubeflow KServe model serving instance.