Enabling GPU Support for Livy Sessions

Describes how to enable and allocate GPU resources on Livy Server.

Enabling GPU Support for Livy Sessions Created Using Spark Interactive Sessions

To enable GPU processing and allocate GPU resources when using Spark interactive sessions, follow these steps:
  1. Follow the instructions in Creating Interactive Sessions until you reach the Spark Configurations box in the Session Configurations and Dependencies step. See Creating Interactive Sessions.
  2. Set the Spark configurations for GPU by providing key-value pairs. Click Add Configuration to add each Spark configuration required to run your session.

  3. To specify the details for the other boxes or options in the Session Configurations and Dependencies step and to finish creating the interactive session, see Creating Interactive Sessions.
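As an illustration, the key-value pairs below are the kind of settings typically entered in step 2 when GPU acceleration is provided by the RAPIDS Accelerator for Apache Spark. The exact keys and values depend on your cluster and plugin version, so treat this as a sketch to adapt rather than a definitive list:

```
spark.plugins=com.nvidia.spark.SQLPlugin
spark.rapids.sql.enabled=true
spark.executor.resource.gpu.amount=1
spark.task.resource.gpu.amount=1
```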

Enabling GPU Support for Livy Sessions Created Using Notebooks

To enable GPU processing and allocate GPU resources when using Spark magic (%manage_spark) to create Livy sessions, follow these steps:
  1. Run %manage_spark to connect to the Livy server and start a new session. See %manage_spark for details.
  2. Run %config_spark to add the Spark configurations.
  3. Click the +Add Spark Configuration Key-Value Pair button.
  4. Enter the key and value for Spark Configurations for GPU in their respective boxes.
  5. After you have finished adding the key-value pairs, click Submit. This saves the new Spark configuration and enables GPU support for the Livy session.
  6. To specify the details for the other boxes or options in the Create Session step and to finish creating the Livy session, see %manage_spark.
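Under the hood, both of the paths above create a Livy session whose configuration carries the GPU settings. A minimal sketch of the equivalent payload for Livy's REST API (POST /sessions) is shown below; the conf keys assume the RAPIDS Accelerator for Apache Spark and are illustrative only:

```python
import json


def build_gpu_session_payload():
    """Build a Livy POST /sessions payload carrying GPU Spark configurations.

    The conf keys below assume the RAPIDS Accelerator plugin; adapt them
    to whatever GPU settings your cluster actually requires.
    """
    return {
        "kind": "pyspark",
        "conf": {
            "spark.plugins": "com.nvidia.spark.SQLPlugin",
            "spark.rapids.sql.enabled": "true",
            "spark.executor.resource.gpu.amount": "1",
            "spark.task.resource.gpu.amount": "1",
        },
    }


payload = build_gpu_session_payload()
print(json.dumps(payload, indent=2))
# To create the session, POST this JSON to http://<livy-host>:8998/sessions
```

Sending this payload yields the same kind of GPU-enabled session as the interactive or notebook workflows described above.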

Verifying Livy Sessions are Running on GPU

To verify that a Livy session is running on GPU, use Spark's explain method, which prints the physical plan for a query.

Run the following PySpark application for Livy Sessions Created Using Spark Interactive Sessions:
from pyspark.sql import SQLContext

# sc is the SparkContext that the Livy session provides automatically
sqlContext = SQLContext(sc)

df = sqlContext.createDataFrame([1, 2, 3], "int").toDF("value")
df.createOrReplaceTempView("df")

sqlContext.sql("SELECT * FROM df WHERE value<>1").explain()
sqlContext.sql("SELECT * FROM df WHERE value<>1").show()
Run the following PySpark application for Livy Sessions Created Using Notebooks:
from pyspark.sql import SQLContext

from py4j.java_gateway import java_import
jvm = sc._jvm
java_import(jvm, "org.apache.spark.sql.api.python.*")

sqlContext = SQLContext(sc)

df = sqlContext.createDataFrame([1,2,3], "int").toDF("value")
df.createOrReplaceTempView("df")

sqlContext.sql("SELECT * FROM df WHERE value<>1").explain()
sqlContext.sql("SELECT * FROM df WHERE value<>1").show()
If the explain method prints GPU-related stages, as in the following output, your Livy session is running on GPU:
== Physical Plan ==
GpuColumnarToRow false
+- GpuFilter NOT (value#2 = 1), true
   +- GpuRowToColumnar targetsize(2147483647)
      +- *(1) SerializeFromObject [input[0, int, false] AS value#2]
         +- Scan[obj#1]
However, if you get the following output, your Livy session is running on CPU rather than GPU. In that case, verify that the session's Spark configurations for GPU are set correctly.
== Physical Plan ==
*(1) Filter NOT (value#2 = 1)
+- *(1) SerializeFromObject [input[0, int, false] AS value#2]
   +- Scan[obj#1]
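Rather than eyeballing the plan, you can also check the plan text programmatically. The helper below is a minimal sketch based on the operator names shown above (GpuFilter, GpuRowToColumnar, GpuColumnarToRow): it simply looks for the Gpu prefix in the plan string. In a live session you would feed it the plan text, for example the string returned by df._jdf.queryExecution().executedPlan().toString() (an internal PySpark accessor, so this is an assumption about your Spark version):

```python
def plan_uses_gpu(plan_text):
    """Heuristic: a physical plan produced by a GPU-accelerated session
    contains operators prefixed with 'Gpu' (e.g. GpuFilter)."""
    return "Gpu" in plan_text


# Sample plans taken from the outputs shown above
gpu_plan = """== Physical Plan ==
GpuColumnarToRow false
+- GpuFilter NOT (value#2 = 1), true
   +- GpuRowToColumnar targetsize(2147483647)
      +- *(1) SerializeFromObject [input[0, int, false] AS value#2]
         +- Scan[obj#1]"""

cpu_plan = """== Physical Plan ==
*(1) Filter NOT (value#2 = 1)
+- *(1) SerializeFromObject [input[0, int, false] AS value#2]
   +- Scan[obj#1]"""

print(plan_uses_gpu(gpu_plan))  # True
print(plan_uses_gpu(cpu_plan))  # False
```

A substring check like this is deliberately simple; it suffices here because CPU plans never contain operators with the Gpu prefix.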