Configuring the Hive Storage Plugin
About this task
Drill can work with only one version of Hive in a given cluster. To access Hive
tables using custom SerDes or InputFormat/OutputFormat, all nodes running Drill must
have the SerDes or InputFormat/OutputFormat JAR files in the
<drill_installation_directory>/jars/3rdparty location.
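As a quick way to confirm the JAR requirement on a node, the following is a minimal sketch that reports which expected JAR files are absent from the jars/3rdparty directory. The function name `missing_jars`, the Drill installation path, and the JAR file name are all hypothetical placeholders, not values from this documentation; run it on each node that runs Drill.

```python
from pathlib import Path


def missing_jars(drill_home, required_jars):
    """Return the required JAR names not found in <drill_home>/jars/3rdparty."""
    third_party = Path(drill_home) / "jars" / "3rdparty"
    # A missing directory is treated the same as an empty one.
    if third_party.is_dir():
        present = {p.name for p in third_party.glob("*.jar")}
    else:
        present = set()
    return sorted(name for name in required_jars if name not in present)


if __name__ == "__main__":
    # Placeholder values (assumptions): substitute your Drill installation
    # directory and the file names of your custom SerDe/InputFormat JARs.
    print(missing_jars("/path/to/drill", ["my-custom-serde.jar"]))
```

An empty list means every listed JAR is present on that node; any names returned still need to be copied there.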
configure.sh
If the Hive storage plugin is disabled, and the configuration in the Drill Web UI
displays “null,” you must rerun configure.sh with the -hiveMetastoreHost argument.
See configure.sh for details.
Configuring a Hive Remote Metastore
{
  "type": "hive",
  "enabled": true,
  "configProps": {
    "hive.metastore.uris": "",
    "javax.jdo.option.ConnectionURL": "jdbc:derby:;databaseName=../sample-data/drill_hive_db;create=true",
    "hive.metastore.warehouse.dir": "/tmp/drill_hive_wh",
    "fs.default.name": "file:///",
    "hive.metastore.sasl.enabled": "false",
    "datanucleus.schema.autoCreateAll": "true"
  }
}
Complete the following steps to modify the default Hive storage plugin configuration for your file system environment:
Procedure
- Verify that Hive is running.
- Issue the following command to start the Hive metastore service on the system
  specified in hive.metastore.uris:
  hive --service metastore
- Start the Drill Web UI.
- Select the Storage tab. If Web UI security is enabled, you must have administrator privileges to perform this step.
- In the list of disabled storage plugins in the Drill Web UI, click Update next to Hive.
- Update the following Hive storage plugin parameters to match the system
  environment:
  - "hive.metastore.uris"
  - "javax.jdo.option.ConnectionURL": "jdbc:<database>://<host:port>/<metastore database>"
- Change the default location of files to suit your environment. For example,
  change "fs.default.name": "file:///" to the file system location: maprfs:///
- To run Drill and Hive in a secure cluster, change the "hive.metastore.sasl.enabled" parameter to "true".
- Change the "datanucleus.schema.autoCreateAll" property setting for your system environment. After it is enabled, "datanucleus.schema.autoCreateAll" initializes the Hive metastore schema.
  - In a production environment, remove the "datanucleus.schema.autoCreateAll" property from the Hive storage plugin configuration; the property is not required because the preferred schema information is already created for the Hive metastore service.
  - In a test environment with an embedded Hive metastore, you can disable (set to false) this property after the first query on the Hive data source that you submit from Drill. Alternatively, use the Hive schema tool to initialize or upgrade the Hive metastore schema. Using the Hive schema tool is recommended for queries on transactional tables. Run the schematool command as an initialization step:
    /opt/mapr/hive/hive-<version>/bin/schematool -dbType <databaseType> -initSchema
- Click Enable in the Web UI to enable the Hive storage plugin configuration.
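The parameter changes in the procedure can be sketched as a small script that derives an updated configuration from the default one and prints JSON suitable for pasting into the Update dialog. This is an illustration under stated assumptions, not part of Drill: the function name `remote_metastore_config` and its arguments are hypothetical, and the `thrift://<host>:<port>` metastore URI is a placeholder form that does not appear in this section.

```python
import copy
import json

# Default Hive storage plugin configuration, as listed in this section.
DEFAULT_CONFIG = {
    "type": "hive",
    "enabled": True,
    "configProps": {
        "hive.metastore.uris": "",
        "javax.jdo.option.ConnectionURL": "jdbc:derby:;databaseName=../sample-data/drill_hive_db;create=true",
        "hive.metastore.warehouse.dir": "/tmp/drill_hive_wh",
        "fs.default.name": "file:///",
        "hive.metastore.sasl.enabled": "false",
        "datanucleus.schema.autoCreateAll": "true",
    },
}


def remote_metastore_config(metastore_uri, jdbc_url, fs_uri,
                            secure=False, production=True):
    """Return a copy of the default configuration with the procedure's changes applied."""
    config = copy.deepcopy(DEFAULT_CONFIG)
    props = config["configProps"]
    props["hive.metastore.uris"] = metastore_uri
    props["javax.jdo.option.ConnectionURL"] = jdbc_url
    props["fs.default.name"] = fs_uri
    # Property values in the plugin configuration are strings, not booleans.
    props["hive.metastore.sasl.enabled"] = "true" if secure else "false"
    if production:
        # In production the metastore schema already exists, so drop the property.
        props.pop("datanucleus.schema.autoCreateAll", None)
    return config


if __name__ == "__main__":
    # All three values are placeholders (assumptions); substitute your own.
    updated = remote_metastore_config(
        metastore_uri="thrift://<host>:<port>",
        jdbc_url="jdbc:<database>://<host:port>/<metastore database>",
        fs_uri="maprfs:///",
        secure=True,
    )
    print(json.dumps(updated, indent=2))
```

Working from a deep copy leaves the default configuration untouched, so the same script can produce both a production variant and a test variant (`production=False`) that keeps "datanucleus.schema.autoCreateAll".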