Overriding Default maprdb Format Plugin Settings in drill-override.conf

You can override the default maprdb format plugin settings that control the level of parallelism in Drill and the media type in the drill-override.conf file.

To override the default maprdb format plugin settings, add the options to the format-maprdb.json configuration in the /opt/mapr/drill/drill-<version>/conf/drill-override.conf file, as shown:
format-maprdb: {
                json: {
                    scanSizeMB: 512,
                    restrictedScanSizeMB: 4096
                    mediaType: HDD
                    }
                  }

The following sections describe how to modify the options in the configuration shown above:

Configuring the Level of Parallelism in Drill

The size of data chunks and number of minor fragments affect the level of parallelism in Drill. When querying JSON tables, Drill creates minor fragments that scan the chunks of data that HPE Ezmeral Data Fabric Database passes to Drill. Minor fragments are logical units of work that determine the level of parallelism in Drill. The level of parallelism increases with the number of minor fragments. See Drill Query Execution for more information about how Drill executes a query.

Modifying the Size of Data Chunks

The format-maprdb.json.scanSizeMB option changes the size of data chunks that HPE Ezmeral Data Fabric Database passes to Drill when querying JSON tables. Drill creates approximately one minor fragment per data chunk when querying JSON tables. For example, if a table has 4 GB of data and the chunk size is set to 128 MB, Drill creates approximately 32 minor fragments to scan the data chunks.

The default setting for data chunks is 128 MB, however you can override this default in the drill-override.conf file. The value of the format-maprdb.json.scanSizeMB option can range from 32 MB to 8192 MB (8 GB). Adjust the setting based on the size of your tables. Use a higher setting for larger tables and a lower setting for smaller tables. The right setting can reduce latency and increase throughput.

Modifying the Number of Minor Fragments Created

The format-maprdb.json.restrictedScanSizeMB option determines the number of minor fragments that Drill creates to scan the data and do the join-back to a JSON table when executing a non-covering index plan.

The default setting for this option is 4096 MB, however you can override this setting in the drill-override.conf file. The value of this option can range from 32 MB to 8192 MB (8 GB). Adjust the setting based on the size of your tables, keeping in mind that due to the random I/O nature of the join-back, a smaller setting (increased parallelism) may not necessarily increase throughput.
NOTE
The planner.slice_target option in Drill determines the number of minor fragments that can run in parallel. See Modifying Query Planning Options for more information.

Configuring the Media Type

Drill is optimized for SSDs, however HPE Ezmeral Data Fabric Database and Drill can run on HDDs. If you run Drill and HPE Ezmeral Data Fabric Database on HDDs, use the mediaType option in the drill-override.conf file to override the default setting. The mediaType option accepts HDD or SSD (default) as the value. Specify SSD or HDD in upper case as the value in the drill-override.conf file.