Integrate Hue With Spark
About this task
IMPORTANT: Hue integration with Spark is an experimental feature.

Procedure
- In the [spark] section of the hue.ini file, set the livy_server_url
  parameter to the host and port where the Livy server is running:

    [spark]
    # The Livy Server URL.
    livy_server_url=https://node10.cluster.com:8998
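  To confirm that Hue can reach Livy at that address, you can query the
  Livy REST API directly. The /sessions endpoint is part of the standard
  Livy API; the host, port, and use of HTTPS below are assumptions carried
  over from the example above:

    # Query the Livy REST API (adjust host/port for your cluster).
    # A JSON reply such as {"from":0,"total":0,"sessions":[]} means Livy is reachable.
    curl -k https://node10.cluster.com:8998/sessions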
- To configure Hue to use Spark modes, modify livy.conf
  (vim /opt/mapr/livy/livy-<version>/conf/livy.conf); a sketch of the
  mode settings follows this step.
  - If you want to be able to access Hive through Spark in Hue, configure
    Spark with Hive, and set livy.repl.enableHiveContext to true in
    livy.conf. For example:

      ...
      # Whether to enable HiveContext in the Livy interpreter. If true, hive-site.xml
      # is detected on user request and added to the Livy server classpath automatically.
      livy.repl.enableHiveContext = true
      ...
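  The mode itself is selected with the livy.spark.master and
  livy.spark.deploy-mode properties in livy.conf. The YARN cluster-mode
  values below are an illustrative assumption, not the only valid
  combination:

    # What Spark master Livy sessions should use (for example, local or yarn).
    livy.spark.master = yarn
    # What deploy mode Livy sessions should use (client or cluster).
    livy.spark.deploy-mode = cluster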
- If you plan to use PySpark, you must set the PYTHONPATH environment
  variable in livy-env.sh (/opt/mapr/livy/livy-<version>/conf/livy-env.sh):

    ...
    export PYTHONPATH=$SPARK_HOME/python/lib/py4j-<version>-src.zip:$SPARK_HOME/python/:$PYTHONPATH

  For example:

    ...
    export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$SPARK_HOME/python/:$PYTHONPATH
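  The py4j version differs by Spark release. One way to find the archive
  name to substitute (an illustrative check, not part of the documented
  procedure) is to list Spark's bundled Python libraries:

    # List the py4j archive shipped with your Spark installation.
    ls $SPARK_HOME/python/lib/
    # Example output: py4j-0.10.7-src.zip  pyspark.zip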
- Ensure that R is installed on the node if you plan to run SparkR. To
  install R to run SparkR jobs:

  - On Ubuntu:

      sudo apt-get install r-base

  - On Red Hat / Rocky:

      sudo yum install R
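  A quick sanity check (not part of the documented procedure) is to
  confirm that the R runtime is on the PATH:

    # Print the installed R version; any output confirms R is available.
    R --version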
- Restart the Spark REST Job Server (Livy):

    maprcli node services -name livy -action restart -nodes <livy node>
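  To confirm that the service came back up, you can list the services on
  that node; maprcli service list is a standard maprcli command, though
  its output columns vary by release:

    # Check that livy reports as running on the node used above.
    maprcli service list -node <livy node>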
- Restart Hue:

    maprcli node services -name hue -action restart -nodes <hue node>
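  After the restart, you can check the Hue web UI end to end. Hue's
  default web port is 8888, and the host below is an assumption reused
  from the earlier Livy example; adjust both for your deployment:

    # A 200 or 302 response indicates the Hue server restarted cleanly.
    curl -k -I https://node10.cluster.com:8888/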