Integrate Hue With Spark
About this task
IMPORTANT: Hue integration with Spark is an experimental feature.

Procedure
- In the [spark] section of the hue.ini file, set the livy_server_url
  parameter to the host and port where the Livy server is running:

    [spark]
    # The Livy Server URL.
    livy_server_url=https://node10.cluster.com:8998
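  To confirm that Hue can reach Livy at that address, you can query the
  Livy REST API directly. The /sessions endpoint is part of the standard
  Livy API; the host, port, and use of HTTPS below are assumptions carried
  over from the example above:

    # Query the Livy REST API (adjust host/port for your cluster).
    # A JSON reply such as {"from":0,"total":0,"sessions":[]} means Livy is reachable.
    curl -k https://node10.cluster.com:8998/sessions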
- To configure Hue to use Spark modes, modify livy.conf
  (vim /opt/mapr/livy/livy-<version>/conf/livy.conf); a sketch of the
  mode settings follows this step.
  - If you want to be able to access Hive through Spark in Hue, configure
    Spark with Hive, and set livy.repl.enableHiveContext to true in
    livy.conf. For example:

      ...
      # Whether to enable HiveContext in the Livy interpreter. If true, hive-site.xml
      # is detected on user request and added to the Livy server classpath automatically.
      livy.repl.enableHiveContext = true
      ...
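  The mode itself is selected with the livy.spark.master and
  livy.spark.deploy-mode properties in livy.conf. The YARN cluster-mode
  values below are an illustrative assumption, not the only valid
  combination:

    # What Spark master Livy sessions should use (for example, local or yarn).
    livy.spark.master = yarn
    # What deploy mode Livy sessions should use (client or cluster).
    livy.spark.deploy-mode = cluster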
- If you plan to use PySpark, you must set the PYTHONPATH environment
  variable in livy-env.sh (/opt/mapr/livy/livy-<version>/conf/livy-env.sh):

    ...
    export PYTHONPATH=$SPARK_HOME/python/lib/py4j-<version>-src.zip:$SPARK_HOME/python/:$PYTHONPATH

  For example:

    ...
    export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$SPARK_HOME/python/:$PYTHONPATH
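  The py4j version differs by Spark release. One way to find the archive
  name to substitute (an illustrative check, not part of the documented
  procedure) is to list Spark's bundled Python libraries:

    # List the py4j archive shipped with your Spark installation.
    ls $SPARK_HOME/python/lib/
    # Example output: py4j-0.10.7-src.zip  pyspark.zip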
- Ensure that R is installed on the node if you plan to run SparkR. To
  install R to run SparkR jobs:

  - On Ubuntu:

      sudo apt-get install r-base

  - On Red Hat / Rocky:

      sudo yum install R
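  A quick sanity check (not part of the documented procedure) is to
  confirm that the R runtime is on the PATH:

    # Print the installed R version; any output confirms R is available.
    R --version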
- Restart the Spark REST Job Server (Livy):

    maprcli node services -name livy -action restart -nodes <livy node>
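  To confirm that the service came back up, you can list the services on
  that node; maprcli service list is a standard maprcli command, though
  its output columns vary by release:

    # Check that livy reports as running on the node used above.
    maprcli service list -node <livy node>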
- Restart Hue:

    maprcli node services -name hue -action restart -nodes <hue node>
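  After the restart, you can check the Hue web UI end to end. Hue's
  default web port is 8888, and the host below is an assumption reused
  from the earlier Livy example; adjust both for your deployment:

    # A 200 or 302 response indicates the Hue server restarted cleanly.
    curl -k -I https://node10.cluster.com:8888/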