Tutorial: Katib Hyperparameter Tuning
Example 1: TensorFlow
To complete this tutorial:
- If you have not done so already, download the Kubeflow tutorials zip file, which contains sample files for all of the included Kubeflow tutorials.
- Deploy the example file:
kubectl apply -f tensorflow-example.yaml
- Open the Kubeflow UI and navigate to .
- Click the experiment name, and then observe the running trials.
- Check the experiment status:
kubectl get experiment
- Check the experiment trials:
kubectl get trial
Example 2: Random Algorithm
This example may take some time to finish, depending on the resources allocated.
The following hyperparameters can be tuned:
- --lr: Learning rate
- --num-layers: Number of layers in the neural network
- --optimizer: Optimizer
To launch an experiment using the random algorithm example:
- If you have not done so already, download the Kubeflow tutorials zip file, which contains sample files for all of the included Kubeflow tutorials.
- Deploy the example file:
kubectl apply -f random-example.yaml
This example embeds the hyperparameters as arguments. You can embed hyperparameters in another way (for example, by using environment variables) by using the template defined in the TrialTemplate.GoTemplate.RawTemplate section of the yaml file. The template uses the Go template format.
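As a rough illustration of the environment-variable approach, the sketch below shows how a container entrypoint might turn environment variables back into training arguments. The variable names LR, NUM_LAYERS, and OPTIMIZER and the fallback values are hypothetical; they are not defined by the example file.

```shell
# Hypothetical entrypoint helper: reads hyperparameters from environment
# variables (assumed names) and prints the equivalent command-line arguments.
# The defaults after ":-" are illustrative fallbacks, not values from Katib.
build_args_from_env() {
  printf -- '--lr=%s --num-layers=%s --optimizer=%s\n' \
    "${LR:-0.01}" "${NUM_LAYERS:-2}" "${OPTIMIZER:-sgd}"
}
```

With LR=0.02, NUM_LAYERS=4, and OPTIMIZER=adam set in the container, the helper prints --lr=0.02 --num-layers=4 --optimizer=adam.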
This example randomly generates the following hyperparameters:
- --lr: Learning rate (type: double).
- --num-layers: Number of layers in the neural network (type: integer).
- --optimizer: Optimizer (type: categorical).
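To make the three parameter types concrete, here is a minimal sketch of uniform random sampling for one trial. It is not Katib's implementation; the ranges (0.01 to 0.03 for the learning rate, 2 to 5 layers) and the optimizer choices (sgd, adam, ftrl) are assumptions made for illustration only.

```shell
# Sketch of random sampling for one trial (assumed ranges, not Katib's code).
# An optional first argument fixes the random seed.
sample_hyperparameters() {
  awk -v seed="${1:-}" 'BEGIN {
    if (seed != "") srand(seed); else srand()
    lr = 0.01 + rand() * (0.03 - 0.01)      # double in [0.01, 0.03)
    layers = 2 + int(rand() * 4)            # integer in 2..5
    n = split("sgd adam ftrl", opts, " ")   # categorical choices
    opt = opts[1 + int(rand() * n)]
    printf "--lr=%.4f --num-layers=%d --optimizer=%s\n", lr, layers, opt
  }'
}
```

Each call prints one argument string in the same form that the example passes to the training container.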
Check the experiment status:
kubectl describe experiment random-example
Example 3: PyTorch
This example may take some time to finish, depending on the resources allocated.
- If you have not done so already, download the Kubeflow tutorials zip file, which contains sample files for all of the included Kubeflow tutorials.
- Deploy the example file:
kubectl apply -f pytorch-example.yaml
- Open the Kubeflow UI and navigate to .
- Click the experiment name, and then observe the running trials.
- Check the experiment status:
kubectl get experiment
- Check the experiment trials:
kubectl get trial
Clean Up
- Random algorithm example:
kubectl delete -f random-example.yaml
- TensorFlow example:
kubectl delete -f tensorflow-example.yaml
- PyTorch example:
kubectl delete -f pytorch-example.yaml
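If you deployed several examples, the clean-up commands above can be wrapped in a small helper. This is a sketch only: the function name cleanup_examples and the KUBECTL override variable are hypothetical conveniences, while --ignore-not-found is a standard kubectl delete flag that avoids an error when a resource is already gone.

```shell
# Hypothetical helper: deletes every manifest passed as an argument.
# Set KUBECTL=echo to print the commands instead of running them.
cleanup_examples() {
  for manifest in "$@"; do
    "${KUBECTL:-kubectl}" delete -f "$manifest" --ignore-not-found
  done
}
```

For example, cleanup_examples random-example.yaml tensorflow-example.yaml removes the first two examples; pass the PyTorch manifest in the same way.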
Sample Katib Commands
To check experiment results using the kubectl CLI:
- List experiments:
kubectl get experiment
NAME STATUS AGE
random-experiment Succeeded 25m
- Check the experiment result:
kubectl get experiment random-example -o yaml
- List trials:
kubectl get trials
NAME STATUS AGE
random-experiment-24lgqghm Succeeded 26m
- Check trial details:
kubectl get trials random-experiment-24lgqghm -o yaml
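When an experiment has many trials, it can help to count them by status. The sketch below assumes the NAME, STATUS, AGE column layout shown in the sample output above; the function name summarize_trials is hypothetical.

```shell
# Hypothetical helper: reads "kubectl get trials" output on stdin and prints
# one "<status> <count>" line per status, sorted alphabetically.
# Assumes the status is the second whitespace-separated column.
summarize_trials() {
  awk 'NR > 1 { count[$2]++ }
       END { for (s in count) printf "%s %d\n", s, count[s] }' | sort
}
```

A typical invocation would be kubectl get trials | summarize_trials.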
To check the status using the interface:
- Go to the Kubeflow home page.
- Click the View Katib experiments button.
- Click the name of the experiment.
- Observe the built experiment graph after all the trials have Succeeded.