Configuring Multiple Drill Clusters and Designating One Cluster as an OJAI Distributed Query Service
As of Core 6.0 and Drill 1.11, you can run operational queries through the OJAI Distributed Query Service and analytical queries through Drill. To run both operational and analytical workloads in the same cluster, configure multiple Drill clusters within the cluster, and then designate one of those Drill clusters as the OJAI Distributed Query Service. Restricting each workload to its own Drill cluster improves query performance.
Data Distribution
If you install both Drill and the OJAI Distributed Query Service through the Installer, both workloads are processed across the entire cluster. When both services run together this way, the system replicates data across the entire cluster. The resulting remote reads impair performance, which can lead to missed SLAs and memory issues.
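Memory Allocation
Drill and the OJAI Distributed Query Service have different memory requirements.
Drill:
- 8 GB direct
- 4 GB heap
- 1 GB code cache
OJAI Distributed Query Service:
- 1 GB direct
- 3 GB heap
- 512 MB code cache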
If you use the Installer and select both Drill and the OJAI Distributed Query Service, memory is configured with the Drill values. If you only run operational queries, which use less memory than analytical queries, you unnecessarily give up about 8 GB of additional memory.
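The memory overhead can be checked with a quick calculation. The following is a minimal sketch in Python, assuming the larger memory profile listed in this section belongs to Drill and the smaller one to the OJAI Distributed Query Service; the dictionary and function names are illustrative only and do not correspond to any product setting.

```python
# Memory profiles in MB, taken from the figures listed in this section.
# Assumption: the larger profile is Drill's, the smaller is the OJAI service's.
DRILL_MB = {"direct": 8 * 1024, "heap": 4 * 1024, "cache": 1024}
OJAI_MB = {"direct": 1024, "heap": 3 * 1024, "cache": 512}


def total_gb(profile):
    """Return the total memory of a profile in GB."""
    return sum(profile.values()) / 1024


drill_total = total_gb(DRILL_MB)
ojai_total = total_gb(OJAI_MB)
overhead = drill_total - ojai_total

print(f"Drill profile: {drill_total} GB")
print(f"OJAI profile:  {ojai_total} GB")
print(f"Extra memory reserved by the Drill profile: {overhead} GB")
```

Under these assumptions, the Drill profile totals 13 GB and the OJAI profile totals 4.5 GB, so the difference is 8.5 GB, in line with the figure of roughly 8 GB cited above.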
How to Run Drill and the OJAI Distributed Query Service Together in a Cluster
You can manually install Drill on several nodes and divide the nodes into multiple topologies (Drill clusters). For each topology, create and mount a volume. Then, create directories within each volume to store your data. Configure these directories as workspaces in the Drill dfs storage plugin. Finally, configure one Drill cluster to run as the OJAI Distributed Query Service.
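To illustrate the workspace step, the following Python sketch builds a fragment of a dfs storage plugin configuration with one workspace per directory. The workspace names and directory paths are hypothetical, and the `maprfs:///` connection value is an assumption about the file system in use. The fragment omits the `formats` section of a full plugin configuration, so treat it as a starting point rather than a complete definition.

```python
import json


def build_dfs_workspaces(directories):
    """Map each workspace name to a dfs workspace entry for its directory."""
    return {
        name: {
            "location": path,
            "writable": True,
            "defaultInputFormat": None,
        }
        for name, path in directories.items()
    }


# Hypothetical directories, one inside each mounted volume.
directories = {
    "analytics": "/drill-analytics/data",
    "operational": "/drill-ojai/data",
}

# Fragment of a dfs storage plugin configuration (formats section omitted).
plugin_fragment = {
    "type": "file",
    "connection": "maprfs:///",  # assumption: data resides on the cluster file system
    "workspaces": build_dfs_workspaces(directories),
}

print(json.dumps(plugin_fragment, indent=2))
```

You could merge the printed `workspaces` entries into the existing dfs plugin definition, for example through the Storage tab of the Drill Web UI.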
The following topics provide instructions for each of the required steps: