Configuring Drill
Lists the data-fabric-specific configuration for Drill.
Drill is highly configurable. This document focuses on data-fabric-related configurations and refers to the open source Apache Drill documentation for generic information. Key things to configure are:
- Drill memory
- Determine the amount of heap and direct memory allocated to a Drillbit for query processing in a Drill cluster. See Configuring Drill Memory.
- Parquet block size
- Change the Parquet block size to match the filesystem chunk size. See Configuring the Parquet Block Size.
- Resources for a shared Drillbit
- Configure queues and parallelization to support multiple users sharing a Drillbit. You can also run separate Drillbits on different nodes in the cluster. See Configuring Resources for a Shared Drillbit.
- Multitenancy
- Configure a multitenant cluster to account for resources required for Drill. See Configuring a Multitenant Cluster.
- User Impersonation
- Configure impersonation to allow a service to act on behalf of a client while performing the action requested by the client. See User Impersonation.
- User authentication and encryption
- Configure user authentication when you want to verify the identity of a user before permitting the user access to a process running on a system. See Default Security (Tickets).
- SSL/TLS for Encryption
- Enable and configure SSL/TLS for encryption when you need to use Plain authentication. See SSL/TLS for Encryption.
- Drill impersonation with Hive authorization
- Configure Drill impersonation to work with Hive impersonation to authorize access to metadata in the Hive metastore repository and data in the Hive warehouse. See User Impersonation with Hive.
- Volumes to use for spooling
- Use the drill.exec.spill.directories option to set MapReduce volumes or local volumes for spooling. Striping the spooled data across as many disks as possible improves performance.
- Persistent configuration storage
- See Persistent Configuration Storage and Configuring the ZooKeeper PStore Location.
- Access rights
- Configure access rights if you have 777 file-level permissions on a table but a query returns no results. See Configuring Access Rights.
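As an illustration of the spooling item in the list above, the following is a minimal sketch of how the drill.exec.spill.directories option might appear in drill-override.conf. The directory paths are hypothetical placeholders, and the fs value assumes spooling to local disks. Substitute the volumes and file system that apply to your cluster.

```
# drill-override.conf (fragment)
# Assumption: the paths below are placeholders, not recommended locations.
drill.exec: {
  spill: {
    # File system that holds the spill directories (local disks assumed here)
    fs: "file:///",
    # List one directory per disk so that spooled data is striped across disks
    directories: [
      "/disk1/drill/spill",
      "/disk2/drill/spill"
    ]
  }
}
```

Because drill-override.conf is read at startup, a change like this takes effect only after the Drillbit is restarted.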
Drill typically runs alongside other workloads, including the following:
- MapReduce
- YARN
- Hive and Pig
- Spark
You need to plan and configure these resources for use with Drill and other workloads:
- Memory
- CPU
- Disk
Configuring Access Rights
If the security in your organization limits access to HPE Ezmeral Data Fabric Database tables, you might experience a problem querying the tables. If you have 777 file-level permissions to a table, yet a query returns no results, you might need to add your user name to the maprcli Access Control List (ACL).