Running Ray Matrix Multiplication Application

Provides an end-to-end example for creating a notebook server and submitting a matrix multiplication application job in local and distributed settings using Ray in HPE Ezmeral Unified Analytics Software.

Prerequisites

  • Sign in to HPE Ezmeral Unified Analytics Software.
  • Verify that the installed Ray client and server versions match. To verify, complete the following steps in the terminal:
    1. To switch to Ray's environment, run:
      source /opt/conda/etc/profile.d/conda.sh && conda activate ray
    2. To verify that the Ray client and server versions match, run:
      ray --version
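
The same comparison can be scripted from a notebook cell. The following is a minimal sketch and is not part of the tutorial notebook: installed_ray_version and versions_match are hypothetical helper names, and the server version is assumed to be known already and entered by hand.

```python
from importlib import metadata


def installed_ray_version():
    """Return the version of the ray package in the active environment, or None."""
    try:
        return metadata.version("ray")
    except metadata.PackageNotFoundError:
        return None


def versions_match(client_version, server_version):
    """Compare two version strings after trimming surrounding whitespace."""
    if client_version is None or server_version is None:
        return False
    return client_version.strip() == server_version.strip()


if __name__ == "__main__":
    client = installed_ray_version()
    # Hypothetical placeholder: replace with the Ray version of your cluster.
    server = "2.9.0"
    print(f"client={client} server={server} match={versions_match(client, server)}")
```

Run the cell with the Ray environment active so that the client version reported is the one used by the notebook kernel.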

About this task

In this tutorial, you will:

  1. Submit regular Python functions as Ray tasks using JobSubmissionClient to take advantage of Ray's distributed computing capabilities.
  2. Generate two random matrices and multiply them with the NumPy package, first locally and then using Ray.
  3. Record the duration for matrix generation and multiplication to observe Ray’s efficiency under heavy workloads.
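
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in the tutorial notebook: the function names, the matrix size, and the use of ray.init with a local Ray runtime are all chosen for the example, whereas the notebook submits its job through JobSubmissionClient.

```python
import time

import numpy as np


def generate_matrices(size, seed=0):
    """Create two random square matrices of shape (size, size)."""
    rng = np.random.default_rng(seed)
    return rng.random((size, size)), rng.random((size, size))


def multiply(a, b):
    """Multiply two matrices with NumPy."""
    return np.matmul(a, b)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def multiply_with_ray(a, b):
    """Run multiply() as a Ray task and wait for its result."""
    # Imported lazily so the local path works without Ray installed.
    import ray

    ray.init(ignore_reinit_error=True)
    remote_multiply = ray.remote(multiply)  # wrap the plain function as a Ray task
    return ray.get(remote_multiply.remote(a, b))


if __name__ == "__main__":
    a, b = generate_matrices(500)  # illustrative size, not the notebook's
    _, local_seconds = timed(multiply, a, b)
    print(f"Matrix multiplication runtime for local submission is {local_seconds:.2f} seconds.")
    try:
        # The first Ray call also includes the time needed to start Ray.
        _, ray_seconds = timed(multiply_with_ray, a, b)
        print(f"Matrix multiplication runtime for Ray submission is {ray_seconds:.2f} seconds.")
    except ImportError:
        print("Ray is not installed in this environment; skipped the Ray run.")
```

With a single task and a small matrix, the Ray run is unlikely to be faster than the local run, because task scheduling and data transfer add overhead. The comparison becomes meaningful only with heavier workloads.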

Procedure

  1. Create a notebook server using the jupyter-data-science image with at least 3 CPUs and 4 Gi of memory in Kubeflow. See Creating and Managing Notebook Servers.

  2. In your notebook environment, activate the Ray-specific Python kernel.
  3. To ensure optimal performance, use a dedicated directory that contains only the files essential to the job submission as the working directory.

    For example, if you do not see the Matrix_Multiplication folder in the <username> directory, copy the folder from the shared/ezua-tutorials/current-release/Data-Science/Ray/Ray-CPU directory into the <username> directory. The shared directory is accessible to all users. Editing or running examples from the shared directory is not advised. The <username> directory is specific to you and cannot be accessed by other users.

  4. Open the ray-matrix_multiplication-executor.ipynb file in the <username>/Matrix_Multiplication directory.
  5. Select the first cell of the ray-matrix_multiplication-executor.ipynb notebook and click Run the selected cells and advance (play icon). Continue until you run all cells.
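
The advice in step 3 matters because Ray uploads the working directory to the cluster when the job is submitted. The following minimal sketch shows one way to check the directory size and pass the directory to JobSubmissionClient. It is not the notebook's code: the helper names, the cluster address, and the entrypoint are hypothetical placeholders.

```python
import os


def directory_size_bytes(path):
    """Total size of all regular files under path, in bytes."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def build_runtime_env(working_dir):
    """Runtime environment that ships only the given directory with the job."""
    return {"working_dir": working_dir}


def submit_job(address, entrypoint, working_dir):
    """Submit an entrypoint command to a Ray cluster and return the job ID."""
    # Imported lazily so the helpers above can be used without Ray installed.
    from ray.job_submission import JobSubmissionClient

    client = JobSubmissionClient(address)
    return client.submit_job(
        entrypoint=entrypoint,
        runtime_env=build_runtime_env(working_dir),
    )


if __name__ == "__main__":
    working_dir = "."  # hypothetical: point this at your Matrix_Multiplication folder
    print(f"Working directory size: {directory_size_bytes(working_dir)} bytes")
    print(build_runtime_env(working_dir))
    # To submit, call submit_job() with your cluster's address and your own
    # entrypoint command; both values depend on your environment.
```

Checking the size first makes it easy to spot stray datasets or checkpoints before they are uploaded with the job.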

Results

After running the final block of code, you see output similar to the following:

Matrix multiplication runtime for local submission is 39.76 seconds.

Matrix multiplication runtime for Ray submission is 25.66 seconds.

In this sample run, the Ray job submission finished about 14 seconds sooner than the local job submission.
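
To compare the two runtimes as ratios, the arithmetic can be sketched as below. The helper names are hypothetical, and the input values are the sample timings shown above; substitute your own measurements.

```python
def speedup(local_seconds, ray_seconds):
    """How many times faster the Ray run is than the local run."""
    return local_seconds / ray_seconds


def time_saved_percent(local_seconds, ray_seconds):
    """Percentage of the local runtime saved by the Ray run."""
    return 100.0 * (local_seconds - ray_seconds) / local_seconds


if __name__ == "__main__":
    local_seconds, ray_seconds = 39.76, 25.66  # sample timings from the output above
    print(f"Speedup: {speedup(local_seconds, ray_seconds):.2f}x")  # prints "Speedup: 1.55x"
    print(f"Time saved: {time_saved_percent(local_seconds, ray_seconds):.1f}%")  # prints "Time saved: 35.5%"
```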