Integrate Hue With Spark

About this task

IMPORTANT

Hue integration with Spark is an experimental feature.

Procedure

In the [spark] section of the hue.ini file, set the livy_server_url parameter to the host and port where the Livy server is running:
```
[spark]
  # The Livy Server URL.
  livy_server_url=https://node10.cluster.com:8998
```
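One way to confirm that the URL configured above points at a running Livy server is to query Livy's REST API. The following is a minimal sketch, not part of the product: it assumes the server answers GET /sessions without extra authentication, which may not hold on a secure cluster. The helper names build_sessions_request and list_sessions are hypothetical.

```python
import json
import urllib.request


def build_sessions_request(livy_server_url: str) -> urllib.request.Request:
    """Build a GET request for the /sessions endpoint of a Livy server."""
    base = livy_server_url.rstrip("/")
    return urllib.request.Request(
        f"{base}/sessions",
        headers={"Accept": "application/json"},
        method="GET",
    )


def list_sessions(livy_server_url: str, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON response.

    Raises urllib.error.URLError if the server cannot be reached.
    Assumption: no authentication or custom TLS setup is required.
    """
    request = build_sessions_request(livy_server_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling list_sessions("https://node10.cluster.com:8998") with the same value as livy_server_url should return a JSON document describing the current sessions if the server is reachable.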
To configure the Spark mode that Hue uses, edit livy.conf (for example, vim /opt/mapr/livy/livy-<version>/conf/livy.conf):
1. If Spark jobs run in local mode, set the livy.spark.master property:
```
…
# What spark master Livy sessions should use.
livy.spark.master = local[*]
…
```
2. If Spark jobs run in YARN mode, set the livy.spark.master and livy.spark.deployMode (client or cluster) properties. For example:
```
…
# What spark master Livy sessions should use.
livy.spark.master = yarn
# What spark deploy mode Livy sessions should use.
livy.spark.deployMode = cluster
…
```
3. If Spark jobs run in standalone mode, set the livy.spark.master property. For example:
```
# What spark master Livy sessions should use.
livy.spark.master = spark://ubuntu500:7077
```
4. If Spark jobs run in Mesos mode, set the livy.spark.master property. For example:
```
# What spark master Livy sessions should use.
livy.spark.master = mesos://<mesos-master-node-ip>:5050 
```
  NOTE
  Integration of Spark on Mesos with Hue is not supported in cluster deployment mode.
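
The mode settings above can be checked before Livy is restarted. The following sketch reads livy.conf text as simple "key = value" lines and reports settings that do not match the four modes described in this procedure, including the unsupported combination of Mesos with cluster deployment mode. The function names and messages are hypothetical, and the parser is an assumption about the file layout, not Livy's own loader.

```python
def parse_livy_conf(text: str) -> dict:
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        conf[key.strip()] = value.strip()
    return conf


def check_spark_mode(conf: dict) -> list:
    """Return a list of findings about the Spark mode settings."""
    problems = []
    master = conf.get("livy.spark.master", "")
    deploy_mode = conf.get("livy.spark.deployMode")

    if not master:
        problems.append("livy.spark.master is not set")
    elif not (
        master.startswith("local")
        or master == "yarn"
        or master.startswith(("spark://", "mesos://"))
    ):
        problems.append(f"unrecognized livy.spark.master: {master}")

    if deploy_mode is not None and deploy_mode not in ("client", "cluster"):
        problems.append(f"unrecognized livy.spark.deployMode: {deploy_mode}")

    # Per the note above, Mesos with cluster deployment mode is not supported.
    if master.startswith("mesos://") and deploy_mode == "cluster":
        problems.append(
            "Spark on Mesos is not supported in cluster deployment mode"
        )
    return problems
```

An empty list means the settings match one of the modes listed above. It does not confirm that the master itself is reachable.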

To access Hive through Spark in Hue, configure Spark with Hive and set livy.repl.enableHiveContext to true in livy.conf. For example:

...
# Whether to enable HiveContext in livy interpreter, if it is true hive-site.xml will be detected
# on user request and then livy server classpath automatically.
livy.repl.enableHiveContext = true
...
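
After Livy is restarted with this setting, Hive access can be exercised through Livy's REST API by creating a session and submitting a statement. The sketch below only builds the two POST requests. It assumes a PySpark session in which a `spark` session object is predefined (this depends on the Livy and Spark versions) and that no authentication or extra headers are required. The helper names are hypothetical.

```python
import json
import urllib.request


def post_json(url: str, payload: dict) -> urllib.request.Request:
    """Build a POST request with a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session_request(livy_server_url: str) -> urllib.request.Request:
    """POST /sessions: ask Livy to start an interactive PySpark session."""
    base = livy_server_url.rstrip("/")
    return post_json(f"{base}/sessions", {"kind": "pyspark"})


def hive_check_request(
    livy_server_url: str, session_id: int
) -> urllib.request.Request:
    """POST /sessions/{id}/statements: list databases through Spark SQL.

    Assumption: `spark` is defined inside the session.
    """
    base = livy_server_url.rstrip("/")
    code = 'spark.sql("SHOW DATABASES").show()'
    return post_json(f"{base}/sessions/{session_id}/statements", {"code": code})
```

Each request can be sent with urllib.request.urlopen. The session id comes from the JSON response to the first request, and the statement should be submitted only after the session reports an idle state.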

If you plan to use PySpark, you must set the PYTHONPATH environment variable in livy-env.sh (/opt/mapr/livy/livy-<version>/conf/livy-env.sh):

...
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-<version>-src.zip:$SPARK_HOME/python/:$PYTHONPATH

For example:

...
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$SPARK_HOME/python/:$PYTHONPATH
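
The py4j version in this path depends on the installed Spark release. As an illustration, the following sketch locates the py4j source archive under a given Spark home and assembles the same PYTHONPATH value. The function name build_pythonpath is hypothetical, and picking the lexicographically last match when several archives exist is an arbitrary choice.

```python
import glob
import os


def build_pythonpath(spark_home: str, existing: str = "") -> str:
    """Assemble the PYTHONPATH value: py4j archive, Spark python dir, old value."""
    lib_dir = os.path.join(spark_home, "python", "lib")
    matches = sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip")))
    if not matches:
        raise FileNotFoundError(f"no py4j-*-src.zip found in {lib_dir}")

    # Usually there is exactly one archive; otherwise take the last one.
    parts = [matches[-1], os.path.join(spark_home, "python")]
    if existing:
        parts.append(existing)
    return os.pathsep.join(parts)
```

For example, build_pythonpath(os.environ["SPARK_HOME"], os.environ.get("PYTHONPATH", "")) returns the string to export in livy-env.sh.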

If you plan to run SparkR, ensure that R is installed on the node. To install R:
On Ubuntu
```
sudo apt-get install r-base
```
On Red Hat / Rocky
```
sudo yum install R
```
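
To check whether R is already present on the node before installing it, a lookup on the executable search path is usually enough. This is a minimal sketch with a hypothetical helper name, find_r. It only checks that an R or Rscript executable can be found, not that the installed version is suitable for SparkR.

```python
import shutil
from typing import Optional


def find_r(search_path: Optional[str] = None) -> Optional[str]:
    """Return the path of the R (or Rscript) executable, or None if absent.

    search_path defaults to the PATH environment variable.
    """
    return shutil.which("R", path=search_path) or shutil.which(
        "Rscript", path=search_path
    )
```

A result of None suggests that one of the installation commands above is still needed on that node.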

Restart the Spark REST Job Server (Livy):

maprcli node services -name livy -action restart -nodes <livy node>

Restart Hue:

maprcli node services -name hue -action restart -nodes <hue node>

HPE Data Fabric 8.0.0 Software Documentation
Abstract	This site contains documentation for HPE Data Fabric Software version 8.0.0 including installation, configuration, administration, and reference content, as well as content for the associated bundled ecosystem components and drivers.
Published	July 2026
Edition	8.0.0
Topic last updated	2024-05-28