Configure Spark with the NodeManager Local Directory Set to file system

About this task

This procedure configures Spark to use the mounted NFS directory instead of the /tmp directory on the local file system. Note that spill to disk should be directed to node-local storage on the file system only when local disks are unavailable or low on space.

Procedure

  1. Install the mapr-loopbacknfs and nfs-utils packages if they are not already installed. For reference, see Installing the mapr-loopbacknfs Package and Setting Up Data Fabric NFS.
  2. Start the mapr-loopbacknfs service by following the steps at Managing the mapr-loopbacknfs Service.
  3. To configure Spark Shuffle on NFS, complete these steps on all nodes:
    1. Create a local volume for Spark Shuffle (a verification sketch follows this procedure):
      sudo -u mapr maprcli volume create -name mapr.$(hostname -f).local.spark -path /var/mapr/local/$(hostname -f)/spark -replication 1 -localvolumehost $(hostname -f)
    2. Point the NodeManager local directory to the Spark Shuffle volume mounted through NFS by setting the following property in the /opt/mapr/hadoop/hadoop-<version>/etc/hadoop/yarn-site.xml file on the NodeManager nodes (a check that this path resolves over NFS is sketched after this procedure):
      <property>
          <name>yarn.nodemanager.local-dirs</name>
          <value>/mapr/my.cluster.com/var/mapr/local/${mapr.host}/spark</value>
      </property>
      
    3. (Optional) Configure how many times the NodeManager can attempt to delete application-related directories from the volume when Spark uses the mounted NFS directory instead of the local /tmp directory. Increasing this value (the default is 2) can prevent application cache data from accumulating in the volume. This property is available by default starting with EEP 7.1.0; for earlier EEP versions, request the patch (see Applying a Patch).
      <property>
          <name>yarn.nodemanager.max-retry-file-delete</name>
          <value>2</value>
      </property>
    4. Restart the NodeManager and ResourceManager services on the respective nodes so that the yarn-site.xml changes take effect (a check of the restarted services is sketched after this procedure):
      maprcli node services -name nodemanager -action restart -nodes <node 1> <node 2> <node 3>
      maprcli node services -name resourcemanager -action restart -nodes <node 1> <node 2> <node 3>
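
To verify the volume creation step above, the following minimal check confirms that the per-node Spark Shuffle volume exists. It assumes the mapr administrative user and the volume name generated by the maprcli volume create command in that step.

  # Verify that the local Spark Shuffle volume exists (run on the NodeManager node as the mapr user)
  sudo -u mapr maprcli volume info -name mapr.$(hostname -f).local.spark -json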
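
To confirm that the path configured for yarn.nodemanager.local-dirs resolves over NFS, list it through the loopback NFS mount. This sketch assumes the example cluster name my.cluster.com and that ${mapr.host} resolves to the same fully qualified hostname used when creating the volume.

  # The Spark Shuffle volume should be reachable through the NFS mount used by yarn.nodemanager.local-dirs
  ls -ld /mapr/my.cluster.com/var/mapr/local/$(hostname -f)/spark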
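
After restarting the services, a quick way to confirm that they are running again is to list the services on each node. In this sketch, <node 1> stands for one of your NodeManager or ResourceManager hostnames.

  # Check that nodemanager (and, on the ResourceManager node, resourcemanager) report a running state
  maprcli service list -node <node 1>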