Configure Data Fabric Client Node to Run Spark Applications
When Spark runs on YARN, Data Fabric client nodes require the hadoop-yarn-server-web-proxy JAR file to run Spark applications. On Windows, the client node also requires an update to the SPARK_DIST_CLASSPATH environment variable, as illustrated below. A Data Fabric client node (a node with the mapr-client package but without the mapr-core packages) is also known as an edge node.
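The following is a minimal sketch of such an update, assuming the intent is to append the copied web-proxy JAR to SPARK_DIST_CLASSPATH; the Windows install location and version shown are assumptions, so substitute the actual paths on your client. The line can be added to %SPARK_HOME%\conf\spark-env.cmd or run in the current command prompt session:

rem Append the web-proxy JAR to the Spark classpath (illustrative path and version)
set SPARK_DIST_CLASSPATH=%SPARK_DIST_CLASSPATH%;C:\opt\mapr\hadoop\hadoop-<version>\share\hadoop\yarn\hadoop-yarn-server-web-proxy-<version>.jar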
The mapr-client package does not include the JAR file required to run Spark applications. Therefore, you must copy the following JAR file from a Data Fabric cluster node to the same location on the Data Fabric client node where you want to run the Spark application:
/opt/mapr/hadoop/hadoop-<version>/share/hadoop/yarn/hadoop-yarn-server-web-proxy-<version>.jar
For example, here is a JAR file path for Hadoop 3.3.5:
/opt/mapr/hadoop/hadoop-3.3.5/share/hadoop/yarn/hadoop-yarn-server-web-proxy-3.3.5.100-eep-920.jar
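One way to perform the copy is with scp run from the client node, as in the sketch below; the hostname cluster-node is a placeholder for any Data Fabric cluster node, and the destination directory must already exist on the client:

# Copy the web-proxy JAR from a cluster node into the matching directory on this client
scp cluster-node:/opt/mapr/hadoop/hadoop-3.3.5/share/hadoop/yarn/hadoop-yarn-server-web-proxy-3.3.5.100-eep-920.jar /opt/mapr/hadoop/hadoop-3.3.5/share/hadoop/yarn/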