Configure Scratch Directory for Spark Standalone

By default, Spark uses the /tmp directory as scratch space. Map output files and RDDs that are stored on disk are written to the scratch directory. To use a different directory, or a comma-separated list of directories, set SPARK_LOCAL_DIRS by adding the following line to the $SPARK_HOME/conf/spark-env.sh file:

export SPARK_LOCAL_DIRS=$SPARK_HOME/<path to scratch directory>
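For example, to spread scratch I/O across several local disks, list more than one directory (the paths below are placeholders; substitute directories that exist on your nodes):

export SPARK_LOCAL_DIRS=/data1/spark-scratch,/data2/spark-scratch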

Make this change before starting the Spark services.

Community Edition (Without NFS Support)

Reserve space on your local disk to use as the scratch directory for Spark.
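As a minimal sketch, assuming a local disk mounted at /data and that the Spark services run as the mapr user (both are assumptions; adjust for your environment):

# Create a scratch directory on a local disk and make it writable
# by the user that runs the Spark services (assumed here to be 'mapr').
sudo mkdir -p /data/spark-scratch
sudo chown mapr:mapr /data/spark-scratch

Then point SPARK_LOCAL_DIRS at /data/spark-scratch in $SPARK_HOME/conf/spark-env.sh, as shown above.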

Enterprise Edition and Enterprise Database Edition (With NFS Support)

Create a local volume on each node with the maprcli volume create command, or from the Control System. Mount that local volume with NFS to a directory. Set that directory as the scratch directory for Spark.
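The following is an illustrative sketch, not an exact recipe: the volume name, paths, hostname, and cluster name are placeholders, and the maprcli options shown are the ones commonly used for local volumes. Run the equivalent on each node:

# Create a local volume pinned to this node (example name and path).
maprcli volume create -name mapr.scratch.node1 -path /scratch/node1 \
  -localvolumehost node1 -replication 1 -createparent 1

# Mount the volume through the NFS gateway to a local directory.
sudo mkdir -p /spark-scratch
sudo mount -o hard,nolock localhost:/mapr/<cluster name>/scratch/node1 /spark-scratch

# Point Spark at the mounted directory in $SPARK_HOME/conf/spark-env.sh.
export SPARK_LOCAL_DIRS=/spark-scratch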
NOTE
Because of SPARK-6313 (Spark's fetch-cache lock file does not work when the scratch directory is on an NFS mount; see https://issues.apache.org/jira/browse/SPARK-6313), set spark.files.useFetchCache to false in your spark-defaults.conf file.
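For example, add the following line to $SPARK_HOME/conf/spark-defaults.conf:

spark.files.useFetchCache false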