Configuring the Hive Storage Plugin
About this task
Drill can work with only one version of Hive in a given cluster. To access Hive
tables that use custom SerDes or InputFormat/OutputFormat classes, every node running
Drill must have the corresponding JAR files in the
<drill_installation_directory>/jars/3rdparty directory.
To query across multiple versions of Hive, install each version of Hive on a separate Drill
cluster. You must define separate storage plugins, each corresponding to the specific
Hive version of the metastore.
NOTE
In EEP 6.0, Drill requires Hive version 2.3.3-mapr or later to successfully query
Hive data sources.
Configuring a Hive Remote Metastore
In a remote metastore configuration, the Hive metastore runs as a separate service
outside of Hive. The metastore service communicates with the Hive database over JDBC.
To connect Drill to the metastore, specify the Hive metastore service address and the
connection parameters in the Hive storage plugin configuration. The Hive
storage plugin (located on the Storage tab in the Drill Web UI)
has the following default configuration when you install Drill:
{
  "type": "hive",
  "enabled": true,
  "configProps": {
    "hive.metastore.uris": "",
    "javax.jdo.option.ConnectionURL": "jdbc:derby:;databaseName=../sample-data/drill_hive_db;create=true",
    "hive.metastore.warehouse.dir": "/tmp/drill_hive_wh",
    "fs.default.name": "file:///",
    "hive.metastore.sasl.enabled": "false",
    "datanucleus.schema.autoCreateAll": "true"
  }
}
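For comparison, a configuration that points Drill at a remote metastore typically sets the metastore service address and drops the embedded Derby settings. The following is a minimal sketch under stated assumptions, not a definitive configuration: <metastore_host> is a hypothetical placeholder, port 9083 assumes the common default for the Hive metastore Thrift service, and the maprfs:/// value assumes a MapR file system. Substitute the values that apply to your cluster.

```json
{
  "type": "hive",
  "enabled": true,
  "configProps": {
    "hive.metastore.uris": "thrift://<metastore_host>:9083",
    "hive.metastore.sasl.enabled": "false",
    "fs.default.name": "maprfs:///"
  }
}
```

In this sketch, a non-empty hive.metastore.uris value is what directs Drill to the external metastore service; the default configuration leaves it empty.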
Complete the following steps to modify the default Hive storage plugin configuration for your file system environment: