Configuring Drill

Lists the data-fabric-specific configuration for Drill.

Drill is highly configurable. This document focuses on data-fabric-related configurations and refers to the open source Apache Drill documentation for generic information. Key things to configure are:

Drill memory: Determine the amount of heap and direct memory allocated to a Drillbit for query processing in a Drill cluster. See Configuring Drill Memory.
Parquet block size: Change the Parquet block size to match the filesystem chunk size. See Configuring the Parquet Block Size.
Resources for a shared Drillbit: Configure queues and parallelization for supporting multiple users sharing a Drillbit. Support separate Drillbits running on different nodes in the cluster. See Configuring Resources for a Shared Drillbit.
Multitenancy: Configure a multitenant cluster to account for resources required for Drill. See Configuring a Multitenant Cluster.
User Impersonation: Configure impersonation to allow a service to act on behalf of a client while performing the action requested by the client. See User Impersonation.
User authentication and encryption: Configure user authentication when you want to verify the identity of a user before permitting the user access to a process running on a system. See Default Security (Tickets).
SSL/TLS for Encryption: Enable and configure SSL/TLS for encryption when you need to use Plain authentication. See SSL/TLS for Encryption.
Drill impersonation with Hive authorization: Configure Drill impersonation to work with Hive impersonation to authorize access to metadata in the Hive metastore repository and data in the Hive warehouse. See User Impersonation with Hive.
Volumes to use for spooling: Use the drill.exec.spill.directories option to set MapReduce volumes or local volumes for spooling to improve performance and stripe data across as many disks as possible.
Persistent configuration storage: See Persistent Configuration Storage and Configuring the ZooKeeper PStore Location.
Access rights: Configure access rights if you have 777 file-level permissions to a table, yet a query returns no results. See Configuring Access Rights.
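
As an illustration of the spooling item in the list above, the following is a minimal sketch of a drill-override.conf fragment that sets drill.exec.spill.directories. The two directory paths are hypothetical placeholders, not values taken from this documentation; substitute the MapReduce volumes or local volumes that exist on your nodes.

```
# drill-override.conf (fragment)
# Assumption: this sits inside the existing drill.exec block, next to
# settings such as cluster-id and zk.connect, which stay unchanged.
drill.exec: {
  # Hypothetical paths; replace with your own volumes, ideally on separate disks.
  spill.directories: [
    "/mnt/disk1/drill/spill",
    "/mnt/disk2/drill/spill"
  ]
}
```

Listing several directories, each on a different disk, is intended to let Drill spread spooled data across those disks. Changes to drill-override.conf generally require a Drillbit restart to take effect.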

Drill typically runs alongside other workloads, including the following:

MapReduce
YARN
Hive and Pig
Spark

You need to plan and configure these resources for use with Drill and other workloads:

Memory
CPU
Disk

Configuring Access Rights

If the security in your organization limits access to HPE Ezmeral Data Fabric Database tables, you might experience a problem querying the tables. If you have 777 file-level permissions to a table, yet a query returns no results, you might need to use maprcli to add your user name to the Access Control List (ACL).

HPE Ezmeral Data Fabric – Customer-Managed 7.9.0 Documentation
Abstract	This site contains documentation for the customer-managed platform of the HPE Ezmeral Data Fabric version 7.9.0 including installation, configuration, administration, and reference content, as well as content for the associated bundled ecosystem components and drivers.
Published	April 2025
Edition	7.9.0
Topic last updated	2020-07-09