Tutorial: Katib Hyperparameter Tuning

Example 1: TensorFlow

To complete this tutorial:

If you have not done so already, download the Kubeflow tutorials zip file, which contains sample files for all of the included Kubeflow tutorials.

Deploy the example file:

kubectl apply -f tensorflow-example.yaml

Open the Kubeflow UI and navigate to Home > View Katib experiments.
Click the experiment name, and then observe the running trials.
Check the experiment status:
```
kubectl get experiment
```
Check the experiment trials:
```
kubectl get trial
```
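
The status check can be repeated until the experiment finishes. The following is a minimal sketch, assuming that `kubectl get experiment` prints the NAME, STATUS, and AGE columns shown in the sample output later in this tutorial, and assuming that Succeeded and Failed are the final statuses. The function names `experiment_status` and `wait_for_experiment`, the polling interval, and the attempt limit are illustrative and not part of Katib.

```shell
#!/bin/sh
# Print the STATUS column for one experiment, or nothing if it is not listed.
# Assumes the NAME / STATUS / AGE column layout of `kubectl get experiment`.
experiment_status() {
    kubectl get experiment | awk -v name="$1" 'NR > 1 && $1 == name { print $2 }'
}

# Poll until the experiment reports Succeeded or Failed (assumed final statuses).
# $1 = experiment name
# $2 = seconds between polls (default 30)
# $3 = maximum number of polls (default 120)
wait_for_experiment() {
    name=$1
    interval=${2:-30}
    attempts=${3:-120}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        status=$(experiment_status "$name")
        case "$status" in
            Succeeded) echo "$name: Succeeded"; return 0 ;;
            Failed)    echo "$name: Failed"; return 1 ;;
        esac
        i=$((i + 1))
        sleep "$interval"
    done
    echo "$name: still ${status:-unknown} after $attempts checks"
    return 2
}
```

To use it, call `wait_for_experiment <experiment-name>` with the name reported by `kubectl get experiment`. The return code distinguishes success (0), failure (1), and timeout (2), so the function can gate later steps in a script.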

Example 2: Random Algorithm

This example may take some time to finish, depending on the resources allocated.

The following hyperparameters can be tuned:

--lr - learning rate
--num-layers - number of layers in the neural network
--optimizer - optimizer

To launch an experiment using the random algorithm example:

If you have not done so already, download the Kubeflow tutorials zip file, which contains sample files for all of the included Kubeflow tutorials.
Deploy the example file:
```
kubectl apply -f random-example.yaml
```

This example embeds the hyperparameters as arguments. You can pass hyperparameters in another way (for example, through environment variables) by editing the template defined in the TrialTemplate.GoTemplate.RawTemplate section of the YAML file. The template uses the Go template format.

This example randomly generates the following hyperparameters:

--lr - Learning rate (type: double).
--num-layers - Number of layers in the neural network (type: integer).
--optimizer - Optimizer (type: categorical).
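
To illustrate what random generation means for these three parameter types, the following sketch draws one value per parameter with awk. It is not Katib's implementation. The ranges (0.01 to 0.05 for --lr, 2 to 5 for --num-layers) and the optimizer choices (sgd, adam, ftrl) are hypothetical placeholders; the real search space is whatever random-example.yaml defines. The function name `sample_hyperparameters` is also illustrative.

```shell
#!/bin/sh
# Draw one random value per hyperparameter and print them as arguments.
# $1 = optional integer seed, so that a draw can be reproduced.
# All ranges and choices below are hypothetical placeholders.
sample_hyperparameters() {
    awk -v seed="$1" 'BEGIN {
        if (seed != "") srand(seed); else srand()

        lr_min = 0.01; lr_max = 0.05                 # double: uniform in [min, max)
        layers_min = 2; layers_max = 5               # integer: uniform in min..max
        n = split("sgd adam ftrl", optimizers, " ")  # categorical: pick one entry

        lr = lr_min + rand() * (lr_max - lr_min)
        layers = layers_min + int(rand() * (layers_max - layers_min + 1))
        optimizer = optimizers[1 + int(rand() * n)]

        printf "--lr=%.4f --num-layers=%d --optimizer=%s\n", lr, layers, optimizer
    }'
}

sample_hyperparameters 42
```

Each call corresponds to the parameter set of one trial. Passing the same seed repeats the same draw, which is convenient when checking the script itself.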

Check the experiment status:

kubectl describe experiment random-example

Example 3: PyTorch

This example may take some time to finish, depending on the resources allocated.

If you have not done so already, download the Kubeflow tutorials zip file, which contains sample files for all of the included Kubeflow tutorials.
Deploy the example file:
```
kubectl apply -f pytorch-example.yaml
```
Open the Kubeflow UI and navigate to Home > View Katib experiments.
Click the experiment name, and then observe the running trials.
Check the experiment status:
```
kubectl get experiment
```
Use the following command to check trials of the experiment:
```
kubectl get trial
```

Clean Up

Delete the examples with the following commands:

Random algorithm example:
```
kubectl delete -f random-example.yaml
```
TensorFlow example:
```
kubectl delete -f tensorflow-example.yaml
```

PyTorch example:

kubectl delete -f pytorch-example.yaml

Sample Katib Commands

To check experiment results using the kubectl CLI:

List experiments:

kubectl get experiment

NAME                STATUS      AGE
random-experiment   Succeeded   25m

Check the experiment result:

kubectl get experiment random-example -o yaml

List trials:

kubectl get trials

NAME                         STATUS      AGE
random-experiment-24lgqghm   Succeeded   26m

Check the trial details:

kubectl get trials random-experiment-24lgqghm -o yaml
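
When an experiment has many trials, a per-status count is easier to read than the full list. The following is a minimal sketch, assuming the NAME, STATUS, and AGE columns shown in the sample output and assuming that trial names begin with the experiment name followed by a hyphen, as they do in that output. The function name `trial_summary` is illustrative.

```shell
#!/bin/sh
# Count the trials of one experiment by STATUS.
# $1 = experiment name; trials are matched by the "<experiment>-" name prefix.
trial_summary() {
    kubectl get trials | awk -v prefix="$1-" '
        NR > 1 && index($1, prefix) == 1 { count[$2]++ }
        END { for (s in count) printf "%s %d\n", s, count[s] }
    ' | sort
}
```

With the sample trial list shown above, `trial_summary random-experiment` prints `Succeeded 1`.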

To check the status using the interface:

Go to the Kubeflow home page.
Click the View Katib experiments button.
Click the name of the experiment.
After all the trials have succeeded, observe the resulting experiment graph.

HPE Ezmeral Runtime Enterprise 5.7 Documentation
Published	May 2025
Edition	5.7.0
Topic last updated	2022-10-10