Enabling GPU Support on Kubeflow Kserve Model Serving

Describes how to enable GPU support on a Kubeflow Kserve model serving instance.

Prerequisites

  • Sign in to HPE Ezmeral Unified Analytics Software.
  • Train and save a model using the PyTorch CUDA or TensorFlow CUDA libraries.

About this task

To enable GPU support for a Kubeflow Kserve model serving instance in HPE Ezmeral Unified Analytics Software, follow these steps:

Procedure

  1. Click the Tools & Frameworks icon on the left navigation bar. Navigate to the Kubeflow tile under the Data Science tab and click Open.
  2. Click Endpoints in the left menu bar of the Kubeflow Central Dashboard.
  3. Click the + New Endpoint button, or click your saved model.
  4. Create or update the InferenceService YAML manifest, setting storageUri and the corresponding predictor type (tensorflow or pytorch).
  5. To enable GPU support, set the resources.limits section of the YAML manifest to request an NVIDIA GPU. For example:

     apiVersion: "serving.kserve.io/v1beta1"
     kind: "InferenceService"
     metadata:
       name: "tensorflow-gpu"
       namespace: "<user-name>"
     spec:
       predictor:
         serviceAccountName: <service-account-name>
         tensorflow:
           storageUri: "s3://mlflow/4/4d60878e34a947b080a6015ae297aaca/artifacts"
           resources:
             limits:
               nvidia.com/gpu: 1

    NOTE
    With MIG configuration, only one GPU can be assigned per application. For details, see GPU Support.
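
If you trained and saved your model with PyTorch rather than TensorFlow, the same pattern applies with the pytorch predictor. The manifest below is a sketch: the name, namespace, service account, and storageUri values are placeholders that you must replace with the values from your own deployment.

     apiVersion: "serving.kserve.io/v1beta1"
     kind: "InferenceService"
     metadata:
       name: "pytorch-gpu"
       namespace: "<user-name>"
     spec:
       predictor:
         serviceAccountName: <service-account-name>
         pytorch:
           storageUri: "s3://<bucket>/<path-to-model-artifacts>"
           resources:
             limits:
               nvidia.com/gpu: 1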

Results

The GPU is now enabled on the Kubeflow Kserve model serving instance.
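
To confirm that the endpoint serves predictions, you can send a request using KServe's v1 inference protocol (a TensorFlow Serving style REST API). The host and model name below are placeholder assumptions, not values from this deployment; substitute the URL and model name shown for your own endpoint, and shape the input instances to match what your model expects:

```python
import json

# Placeholder values -- replace with your endpoint's host and model name.
MODEL_NAME = "tensorflow-gpu"
PREDICT_URL = f"https://<kserve-host>/v1/models/{MODEL_NAME}:predict"

def build_predict_payload(instances):
    """Build a KServe v1 protocol predict request body."""
    return json.dumps({"instances": instances})

# Example: a single input row; the shape must match the model's input signature.
payload = build_predict_payload([[1.0, 2.0, 3.0]])
# Send with an HTTP client, e.g.:
#   requests.post(PREDICT_URL, data=payload,
#                 headers={"Content-Type": "application/json"})
```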