Overriding Default maprdb Format Plugin Settings in drill-override.conf
You can override the default maprdb format plugin settings that control the level of
parallelism in Drill and the media type in the drill-override.conf
file.
format-maprdb.json
configuration in the
/opt/mapr/drill/drill-<version>/conf/drill-override.conf
file, as
shown:format-maprdb: {
json: {
scanSizeMB: 512,
restrictedScanSizeMB: 4096
mediaType: HDD
}
}
The following sections describe how to modify the options in the configuration shown above:
Configuring the Level of Parallelism in Drill
The size of data chunks and number of minor fragments affect the level of parallelism in Drill. When querying JSON tables, Drill creates minor fragments that scan the chunks of data that HPE Ezmeral Data Fabric Database passes to Drill. Minor fragments are logical units of work that determine the level of parallelism in Drill. The level of parallelism increases with the number of minor fragments. See Drill Query Execution for more information about how Drill executes a query.
Modifying the Size of Data Chunks
The format-maprdb.json.scanSizeMB
option changes the size of data chunks
that HPE Ezmeral Data Fabric Database passes to Drill when querying JSON tables. Drill
creates approximately one minor fragment per data chunk when querying JSON tables. For
example, if a table has 4 GB of data and the chunk size is set to 128 MB, Drill creates
approximately 32 minor fragments to scan the data chunks.
The default setting for data chunks is 128 MB, however you can override this default in the
drill-override.conf file. The value of the format-maprdb.json.scanSizeMB
option can range from 32 MB to 8192 MB (8 GB). Adjust the setting based on the size of your
tables. Use a higher setting for larger tables and a lower setting for smaller tables. The
right setting can reduce latency and increase throughput.
Modifying the Number of Minor Fragments Created
The format-maprdb.json.restrictedScanSizeMB
option determines the number
of minor fragments that Drill creates to scan the data and do the join-back to a JSON table
when executing a non-covering index plan.
planner.slice_target
option in
Drill determines the number of minor fragments that can run in parallel. See Modifying Query Planning Options for more
information.Configuring the Media Type
Drill is optimized for SSDs, however HPE Ezmeral Data Fabric Database and Drill can
run on HDDs. If you run Drill and HPE Ezmeral Data Fabric Database on HDDs, use the
mediaType option in the drill-override.conf
file to override the default
setting. The mediaType
option accepts HDD or SSD (default) as the value.
Specify SSD or HDD in upper case as the value in the drill-override.conf
file.