Step 3: Configure YARN to Run Drill
YARN default settings are optimized for MapReduce applications. MapReduce applications use a
limited amount of memory; however, Drill is long-running and consumes a significant amount of
resources. Adjust the YARN memory configuration to allow YARN to allocate containers large
enough to run Drill. Exclude the YARN container directory from systemd-tmpfiles to prevent systemd-tmpfiles from removing Drill's container files while Drill runs.
Increase Maximum Container Size
Note the Drillbit memory allocation in drill-on-yarn.conf, which you previously set in step 2:

drillbit: {
  memory-mb: 14336
}
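
If you want to confirm the configured value on the node that holds your Drill site directory, a quick check such as the following works; the $DRILL_SITE path is an assumption and may differ in your installation:

$ grep 'memory-mb' $DRILL_SITE/drill-on-yarn.conf
  memory-mb: 14336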
Use this number to set the yarn.scheduler.maximum-allocation-mb parameter in /opt/mapr/hadoop/hadoop-<version>/etc/hadoop, substituting the number of the version you have installed. Edit yarn-site.xml to add the following:

<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>14336</value>
  <description>Set to allow Drill containers of 14 GB.</description>
</property>
Restart the YARN Resource Manager to pick up the change, and use the YARN Resource Manager UI to verify that the maximum container size shows the new value.
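
As a sketch, on a MapR-managed cluster you can restart the Resource Manager with maprcli; the exact command and node name depend on how your cluster services are managed, so adjust for your environment:

# Restart the Resource Manager service on the node that runs it.
$ maprcli node services -name resourcemanager -action restart -nodes <rm-hostname>
# The Resource Manager UI (typically port 8088) should then show the new maximum allocation.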
Exclude the YARN Container Directory from systemd-tmpfiles
The system puts the YARN Node Manager container files in the /tmp directory. Most system administrators configure systemd-tmpfiles to periodically remove files in /tmp. Since Drill-on-YARN is a long-running YARN application, systemd-tmpfiles can remove Drill's container files while Drill runs. If this occurs, you must manually shut down the Drill cluster because systemd-tmpfiles will have removed the pid file that YARN needs to manage Drill.
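
The container directory location is controlled by the yarn.nodemanager.local-dirs property, which defaults to a path under /tmp (via hadoop.tmp.dir) when it is not set explicitly. As a quick check, assuming the MapR Hadoop layout used earlier in this step, you can look for an override in yarn-site.xml:

$ grep -A 1 'yarn.nodemanager.local-dirs' /opt/mapr/hadoop/hadoop-<version>/etc/hadoop/yarn-site.xml
# No output means the default applies and container files are created under /tmp.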
Prevent systemd-tmpfiles from cleaning up Drill's container files by adding a new configuration file to /etc/tmpfiles.d/, for example /etc/tmpfiles.d/exclude-nm-local-dir.conf, with the following configuration:

x /tmp/hadoop-mapr/nm-local-dir/*
This entry prevents systemd-tmpfiles from cleaning the nm-local-dir directory when cleaning /tmp. Verify the contents of the file:

$ cat /etc/tmpfiles.d/exclude-nm-local-dir.conf
x /tmp/hadoop-mapr/nm-local-dir/*
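
To see when the periodic cleanup runs, and optionally to confirm that the exclusion protects the Node Manager directory, a check along these lines can help; the timer name shown is the common systemd default and may vary by distribution:

# Show when the next tmpfiles cleanup pass is scheduled.
$ systemctl list-timers systemd-tmpfiles-clean.timer
# Optionally trigger a clean pass now and confirm the container files survive.
$ sudo systemd-tmpfiles --clean
$ ls /tmp/hadoop-mapr/nm-local-dir/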