Enabling GPU Support for Livy Sessions
Describes how to enable and allocate GPU resources on Livy Server.
Enabling GPU Support for Livy Sessions Created Using Spark Interactive Sessions
To enable GPU processing and allocate GPU resources when using Spark interactive sessions,
follow these steps:
- Follow the instructions in Creating Interactive Sessions until you reach the Spark Configurations box in the Session Configurations and Dependencies step.
- Set the Spark configurations for GPU by providing key-value pairs. To add each Spark configuration required to run your session, click Add Configuration. An example set of configurations is shown after this list.
- To specify the details for the other boxes or options in the Session Configurations and Dependencies step and to complete creating the interactive session, see Creating Interactive Sessions.
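For example, a session that uses the RAPIDS Accelerator for Apache Spark typically needs key-value pairs along the following lines. These values are illustrative; the exact plugin class, resource amounts, and any discovery-script settings depend on your cluster and RAPIDS version.
spark.plugins=com.nvidia.spark.SQLPlugin
spark.rapids.sql.enabled=true
spark.executor.resource.gpu.amount=1
spark.task.resource.gpu.amount=1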
Enabling GPU Support for Livy Sessions Created Using Notebooks
To enable GPU processing and allocate GPU resources when using Spark magic (%manage_spark) to create Livy sessions, follow these steps:
- Run %manage_spark to connect to the Livy server and start a new session. See %manage_spark for details.
- Run %config_spark to add the Spark configurations.
- Click the +Add Spark Configuration Key-Value Pair button.
- Enter the key and value for each Spark configuration for GPU in their respective boxes. The same key-value pairs shown in the earlier example apply here.
- After you have finished adding the key-value pairs, click Submit. This saves the new Spark configurations and enables GPU support for the Livy session.
- To specify the details for the other boxes or options in the Create Session step and to complete creating the Livy session, see %manage_spark. A programmatic alternative that passes the same configurations through the Livy REST API is sketched after this list.
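If you create sessions programmatically instead of through the UI, the same key-value pairs can be passed in the conf field of a Livy REST API session request. The following is a minimal sketch, assuming a Livy endpoint at http://livy-host:8998 (a placeholder) and the illustrative configuration values from the earlier example:
import json
import requests

# Illustrative GPU configurations; adapt to your cluster and RAPIDS version.
conf = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.rapids.sql.enabled": "true",
    "spark.executor.resource.gpu.amount": "1",
    "spark.task.resource.gpu.amount": "1",
}

# POST /sessions creates an interactive Livy session with the given conf.
response = requests.post(
    "http://livy-host:8998/sessions",  # placeholder Livy endpoint
    data=json.dumps({"kind": "pyspark", "conf": conf}),
    headers={"Content-Type": "application/json"},
)
print(response.json())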
Verifying Livy Sessions are Running on GPU
To verify that a Livy session is running on GPU, use the Spark explain method, which prints the physical plan of a query.
Run the following PySpark application for Livy Sessions Created Using Spark Interactive
Sessions:
from pyspark.sql import SQLContext

# Livy interactive sessions predefine the SparkContext as sc.
sqlContext = SQLContext(sc)
df = sqlContext.createDataFrame([1, 2, 3], "int").toDF("value")
df.createOrReplaceTempView("df")
# explain() prints the physical plan; GPU operators indicate GPU execution.
sqlContext.sql("SELECT * FROM df WHERE value<>1").explain()
sqlContext.sql("SELECT * FROM df WHERE value<>1").show()
Run the following PySpark application for Livy Sessions Created Using
Notebooks:
from pyspark.sql import SQLContext
from py4j.java_gateway import java_import

# Import the Spark SQL Python API classes into the JVM gateway.
jvm = sc._jvm
java_import(jvm, "org.apache.spark.sql.api.python.*")

sqlContext = SQLContext(sc)
df = sqlContext.createDataFrame([1, 2, 3], "int").toDF("value")
df.createOrReplaceTempView("df")
# explain() prints the physical plan; GPU operators indicate GPU execution.
sqlContext.sql("SELECT * FROM df WHERE value<>1").explain()
sqlContext.sql("SELECT * FROM df WHERE value<>1").show()
If the explain method prints GPU-related stages like the following, your Livy session is running on GPU:
== Physical Plan ==
GpuColumnarToRow false
+- GpuFilter NOT (value#2 = 1), true
+- GpuRowToColumnar targetsize(2147483647)
+- *(1) SerializeFromObject [input[0, int, false] AS value#2]
+- Scan[obj#1]
However, if you get the following output, your Livy session is running on CPU instead of GPU. In that case, check that the Spark configurations for GPU described in the previous sections are set correctly.
== Physical Plan ==
*(1) Filter NOT (value#2 = 1)
+- *(1) SerializeFromObject [input[0, int, false] AS value#2]
+- Scan[obj#1]
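As a quick programmatic check, you can also test the plan string from Python instead of reading the explain output by eye. This is a minimal sketch that relies on PySpark's internal _jdf handle (an implementation detail, not a public API):
# Fetch the executed physical plan of the same query as a string.
plan = (sqlContext.sql("SELECT * FROM df WHERE value<>1")
        ._jdf.queryExecution().executedPlan().toString())
# GPU operators such as GpuFilter appear only when the query runs on GPU.
print("Running on GPU" if "Gpu" in plan else "Running on CPU")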