Enabling GPU Support on Kubeflow Kserve Model Serving

Describes how to enable GPU support on a Kubeflow Kserve model serving instance.

Prerequisites

  • Sign in to HPE Ezmeral Unified Analytics Software.
  • Train and save a model using the PyTorch CUDA or TensorFlow CUDA libraries.

About this task

To enable GPU support for a Kubeflow Kserve model serving instance in HPE Ezmeral Unified Analytics Software, follow these steps:

Procedure

  1. Click the Tools & Frameworks icon on the left navigation bar. Navigate to the Kubeflow tile under the Data Science tab and click Open.
  2. Click Endpoints in the left menu bar of the Kubeflow Central Dashboard.
  3. Click the + New Endpoint button, or click your saved model.
  4. Create or update the InferenceService YAML manifest, setting storageUri and the corresponding predictor type (tensorflow or pytorch).
  5. To enable GPU support, set the resources.limits section of the YAML manifest to request an NVIDIA GPU. For example:

     apiVersion: "serving.kserve.io/v1beta1"
     kind: "InferenceService"
     metadata:
       name: "tensorflow-gpu"
       namespace: "<user-name>"
     spec:
       predictor:
         serviceAccountName: <service-account-name>
         tensorflow:
           storageUri: "s3://mlflow/4/4d60878e34a947b080a6015ae297aaca/artifacts"
           resources:
             limits:
               nvidia.com/gpu: 1

    NOTE
    With MIG configuration, only one GPU can be assigned per application. For details, see GPU Support.
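
If you trained and saved your model with PyTorch rather than TensorFlow, the same pattern applies with the pytorch predictor. The manifest below is a sketch: the name, namespace, service account, and storageUri values are placeholders that you must replace with the values from your own deployment.

     apiVersion: "serving.kserve.io/v1beta1"
     kind: "InferenceService"
     metadata:
       name: "pytorch-gpu"
       namespace: "<user-name>"
     spec:
       predictor:
         serviceAccountName: <service-account-name>
         pytorch:
           storageUri: "s3://<bucket>/<path-to-model-artifacts>"
           resources:
             limits:
               nvidia.com/gpu: 1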

Results

The GPU is now enabled on the Kubeflow Kserve model serving instance.
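
To confirm that the endpoint serves predictions, you can send a request using KServe's v1 inference protocol (a TensorFlow Serving style REST API). The host and model name below are placeholder assumptions, not values from this deployment; substitute the URL and model name shown for your own endpoint, and shape the input instances to match what your model expects:

```python
import json

# Placeholder values -- replace with your endpoint's host and model name.
MODEL_NAME = "tensorflow-gpu"
PREDICT_URL = f"https://<kserve-host>/v1/models/{MODEL_NAME}:predict"

def build_predict_payload(instances):
    """Build a KServe v1 protocol predict request body."""
    return json.dumps({"instances": instances})

# Example: a single input row; the shape must match the model's input signature.
payload = build_predict_payload([[1.0, 2.0, 3.0]])
# Send with an HTTP client, e.g.:
#   requests.post(PREDICT_URL, data=payload,
#                 headers={"Content-Type": "application/json"})
```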