Configuring a Multitenant Cluster
Drill operations are memory- and CPU-intensive. Currently, Drill resources are managed outside of any cluster management service, such as the Warden service. In a multitenant or any other type of cluster, YARN-enabled or not, you configure memory and memory usage limits for Drill by modifying drill-env.sh, as described in the section "Configuring Drill Memory" in the Apache Drill documentation. drill-env.sh allocates resources for Drill to use during query execution, while configuring the following properties in warden.drill-bits.conf prevents Warden from committing the resources to other processes:
service.heapsize.min=<some value in MB>
service.heapsize.max=<some value in MB>
service.heapsize.percent=<a whole number>
Set the service.heapsize properties in warden.drill-bits.conf regardless of whether you changed the defaults in drill-env.sh. "Configuring Drill in a YARN-enabled Cluster" shows an example of setting the service.heapsize properties. The service.heapsize.percent property is the percentage of memory for the service, bounded by the minimum and maximum values. Typically, users change service.heapsize.percent because a percentage setting increases or decreases resources according to different node configurations. For more information about the service.heapsize properties, see the section "warden.<servicename>.conf."
/opt/mapr/conf/conf.d:
warden.drill-bits.conf
warden.nodemanager.conf
warden.resourcemanager.conf
Configure Drill memory by modifying warden.drill-bits.conf in YARN and non-YARN clusters. Configure other resources by modifying warden.nodemanager.conf and warden.resourcemanager.conf in a YARN-enabled cluster.
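The way service.heapsize.percent is bounded by the minimum and maximum values can be illustrated with a short sketch. This is not Warden's implementation. It assumes the percentage applies to the total physical memory of the node, and the function name and the sample values (10 percent, 2000 MB, 16000 MB) are hypothetical.

```python
def service_heap_mb(node_memory_mb, percent, min_mb, max_mb):
    """Return `percent` of node memory in MB, bounded by min_mb and max_mb.

    Assumption: the percentage is taken of total node memory. This mirrors
    the description in the text, not Warden's actual source code.
    """
    if min_mb > max_mb:
        raise ValueError("min_mb must not exceed max_mb")
    target_mb = node_memory_mb * percent // 100  # whole megabytes
    return max(min_mb, min(target_mb, max_mb))


# Hypothetical settings applied to three node sizes.
for node_mb in (8192, 32768, 262144):
    print(node_mb, service_heap_mb(node_mb, percent=10, min_mb=2000, max_mb=16000))
```

With the same three settings, a small node falls back to the minimum, a mid-sized node gets the percentage, and a large node is capped at the maximum, which is why a percentage setting adapts to different node configurations.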
Configuring Drill in a YARN-enabled Cluster
To add Drill to a YARN-enabled cluster, change memory resources to suit your application. For example, suppose you have 120G of available memory that you allocate to the following workloads in a YARN-enabled cluster:
- File system = 20G
- YARN = 20G
- OS = 8G
If YARN does most of the work, give Drill 20G, for example, and give YARN 60G. If you expect a heavy query load, give Drill 60G and YARN 20G.
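The arithmetic behind the two scenarios can be checked with a small sketch. It assumes that the file system and OS figures stay at 20G and 8G in both scenarios and that only the Drill and YARN shares change. The scenario names and the helper function are hypothetical.

```python
TOTAL_GB = 120  # available memory from the example in the text

# Assumption: file system and OS allocations are fixed; Drill and YARN vary.
SCENARIOS = {
    "yarn_heavy": {"file_system": 20, "os": 8, "drill": 20, "yarn": 60},
    "query_heavy": {"file_system": 20, "os": 8, "drill": 60, "yarn": 20},
}


def headroom_gb(allocations, total_gb=TOTAL_GB):
    """Return unallocated memory in GB, or raise if the plan overcommits."""
    used = sum(allocations.values())
    if used > total_gb:
        raise ValueError(f"plan uses {used}G but only {total_gb}G is available")
    return total_gb - used


for name, plan in SCENARIOS.items():
    print(name, headroom_gb(plan))
```

Under these assumptions both plans allocate 108G and leave 12G of headroom on the node.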
YARN consists of two main services:
- ResourceManager: There is at least one instance in a cluster, more if you configure high availability.
- NodeManager: There is one instance per node.
The warden.resourcemanager.conf and warden.nodemanager.conf files set ResourceManager and NodeManager memory to the following defaults:
service.heapsize.min=64
service.heapsize.max=325
service.heapsize.percent=2
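Before changing these values, it can help to read the three properties from a file and sanity-check them. The following sketch parses key=value lines like the defaults above. The function name is hypothetical, and treating lines that start with # as comments is an assumption about the file format.

```python
def parse_heapsize(text):
    """Collect service.heapsize.* properties from key=value text as integers."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # assumption: '#' marks a comment
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key.startswith("service.heapsize."):
            settings[key] = int(value.strip())
    return settings


def check_heapsize(settings):
    """Raise ValueError if the min/max/percent values are inconsistent."""
    if settings["service.heapsize.min"] > settings["service.heapsize.max"]:
        raise ValueError("service.heapsize.min exceeds service.heapsize.max")
    if not 0 <= settings["service.heapsize.percent"] <= 100:
        raise ValueError("service.heapsize.percent must be between 0 and 100")


DEFAULTS = """\
service.heapsize.min=64
service.heapsize.max=325
service.heapsize.percent=2
"""

parsed = parse_heapsize(DEFAULTS)
check_heapsize(parsed)
print(parsed)
```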
/opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop/yarn-env.sh
You do not set the -Xmx option, allowing memory to grow as needed.
MapReduce Version 2 and Other Resources
warden.conf
service.command.<servicename>.heapsize.percent
service.command.<servicename>.heapsize.max
service.command.<servicename>.heapsize.min
Configure memory for other services in the same manner. For more information about managing memory in a cluster, see the following sections:
How to Manage Drill CPU Resources
Currently, you do not manage CPU resources within Drill. Use Linux cgroups to manage the CPU resources.
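As a rough illustration of a cgroups CPU limit, the sketch below computes the CFS bandwidth values that cap a process group at a given number of cores. It assumes the CPU bandwidth controller is in use: under cgroup v1 the two values go into cpu.cfs_quota_us and cpu.cfs_period_us, and under cgroup v2 they are written together to cpu.max. The function name and the choice of four cores are hypothetical, and creating the cgroup and placing the Drill process in it are not shown.

```python
def cfs_bandwidth(cores, period_us=100_000):
    """Return (quota_us, period_us) that limits a cgroup to `cores` CPUs.

    The quota is the CPU time, in microseconds, that the group may consume
    in each period; a quota of N periods allows roughly N full cores.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    quota_us = round(cores * period_us)
    return quota_us, period_us


# Hypothetical limit: allow the Drill cgroup about four cores.
quota, period = cfs_bandwidth(4)
print(f"{quota} {period}")  # the "quota period" form used by cgroup v2 cpu.max
```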