Configuring a Multitenant Cluster
Drill operations are memory and CPU-intensive. Currently, Drill resources are managed
outside of any cluster management service, such as the
Warden service. In a multi-tenant or any other type of cluster, YARN-enabled or not, you
configure memory and memory usage limits for Drill by modifying
drill-env.sh as described in the section, "Configuring Drill Memory" in Apache Drill documentation.
drill-env.sh allocates
resources for Drill to use during query execution, while configuring the following
properties in warden-drill-bits.conf prevents warden from committing
the resources to other processes.
service.heapsize.min=<some value in MB>
service.heapsize.max=<some value in MB>
service.heapsize.percent=<a whole number>Set the service.heapsize properties in
warden.drill-bits.conf regardless of whether you changed defaults in
drill-env.sh or not.
"Configuring Drill in a YARN-enabled Cluster" shows an example of setting the
service.heapsize properties. The
service.heapsize.percent is the percentage of memory for the service
bounded by minimum and maximum values. Typically, users change
service.heapsize.percent because using a percentage setting
increases or decreases resources according to different node configurations. For more
information about the service.heapsize properties, see the section,
"warden.<servicename>.conf."
/opt/mapr/conf/conf.d:
warden.drill-bits.conf
warden.nodemanager.conf
warden.resourcemanager.confConfigure Drill memory by modifying warden.drill-bits.conf in YARN and
non-YARN clusters. Configure other resources by modifying
warden.nodemanager.conf and
warden.resourcemanager.conf in a YARN-enabled cluster.
Configuring Drill in a YARN-enabled Cluster
To add Drill to a YARN-enabled cluster, change memory resources to suit your application. For example, you have 120G of available memory that you allocate to following workloads in a Yarn-enabled cluster:
File system = 20G Yarn = 20G OS = 8G
If Yarn does most of the work, give Drill 20G, for example, and give Yarn 60G. If you expect a heavy query load, give Drill 60G and Yarn 20G.
YARN consists of two main services:
- ResourceManager: There is at least one instance in a cluster, more if you configure high availability.
- NodeManager: There is one instance per node.
warden.resourcemanager.conf and
warden.nodemanager.conf files set ResourceManager and NodeManager
memory to the following
defaults:service.heapsize.min=64
service.heapsize.max=325
service.heapsize.percent=2 /opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop/yarn-env.shYou do not
set the -Xmx option, allowing memory to grow as needed.MapReduce Version 2 and other Resources
warden.conf.service.command.<servicename>.heapsize.percent
service.command.<servicename>.heapsize.max
service.command.<servicename>.heapsize.min Configure memory for other services in the same manner. For more information about managing memory in a cluster, see the following sections:
How to Manage Drill CPU Resources
Currently, you do not manage CPU resources within Drill. Use Linux cgroups to manage the CPU resources.