Configuring Hive and Tez
About this task
To configure Hive on Tez, repeat the following steps on each node where you want Hive to run on Tez. Tez mode is not compatible with all MapReduce (MR) jobs, so do not set up the whole cluster to work on Tez.
There is a known issue related to the incomplete removal of previously installed Tez
packages. The issue affects platforms on which Tez was installed and later removed by using
sudo apt-get remove mapr-tez. Because of Ubuntu-specific behavior and Tez
source-code issues, the remove command removes Tez only partially in some
installations. If this happens, an error is generated when you try to reinstall Tez on
Ubuntu, as described in step 1 of the following procedure. If you believe your installation
might have this issue, you can prevent the error: before performing the following steps,
use the purge command to completely remove all previously installed Tez
packages.
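On Ubuntu, the cleanup described above might look like the following sketch. It assumes that "the purge command" means apt-get purge applied to the same mapr-tez package named in the remove command; the function name purge_tez and the DRY_RUN switch are illustrative and not part of the product.

```shell
# Sketch: purge leftover Tez packages on Ubuntu before reinstalling.
# Assumption: "the purge command" is apt-get purge on the mapr-tez package.
# DRY_RUN defaults to 1, so the command is only printed, not executed.
purge_tez() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "sudo apt-get purge mapr-tez"
  else
    sudo apt-get purge mapr-tez
  fi
}
```

Calling purge_tez prints the command for review; calling it with DRY_RUN=0 would run the purge.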
Procedure
-
Install Tez if it is not already installed. To install Tez, run the
command for your platform:
On CentOS / RedHat:
    yum install mapr-tez
On SLES:
    zypper install mapr-tez
On Ubuntu:
    apt-get install mapr-tez
NOTE: Repeat this step on each node where you want Hive on Tez to be configured.
-
Create the /apps/tez directory on the Data Fabric file system. To create it, run the following commands:
    hadoop fs -mkdir /apps
    hadoop fs -mkdir /apps/tez
-
Upload the Tez libraries to the /apps/tez directory on the Data Fabric file system. To upload them, run the following commands:
    hadoop fs -put /opt/mapr/tez/tez-<version> /apps/tez
    hadoop fs -chmod -R 755 /apps/tez
-
Verify the upload.
To verify, run the following command:
    hadoop fs -ls /apps/tez/tez-<version>
-
Set the Tez environment variables. To set them, open the
/opt/mapr/hive/hive-<version>/conf/hive-env.sh file, add the following lines, and save the file:
    export TEZ_CONF_DIR=/opt/mapr/tez/tez-<version>/conf
    export TEZ_JARS=/opt/mapr/tez/tez-<version>/*:/opt/mapr/tez/tez-<version>/lib/*
    export HADOOP_CLASSPATH=$TEZ_CONF_DIR:$TEZ_JARS:$HADOOP_CLASSPATH
NOTE: Repeat this step on each node where you want Hive on Tez to be configured.
-
Configure Hive for the Tez engine. To configure it, open the
/opt/mapr/hive/hive-<version>/conf/hive-site.xml file, add the following lines, and save the file:
    <property>
      <name>hive.execution.engine</name>
      <value>tez</value>
    </property>
Add the hive.exec.pre.hooks, hive.exec.post.hooks, and hive.exec.failure.hooks properties with the value org.apache.hadoop.hive.ql.hooks.ATSHook to use the Hive queries page in the Tez UI.
NOTE: Starting from EEP 7.1.0, the following execution-hooks properties are managed by running the configure.sh command with the -R option.
    <property>
      <name>hive.exec.pre.hooks</name>
      <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
    </property>
    <property>
      <name>hive.exec.post.hooks</name>
      <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
    </property>
    <property>
      <name>hive.exec.failure.hooks</name>
      <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
    </property>
NOTE: Repeat this step on each node where you want Hive on Tez to be configured.
-
Run configure.sh with the -R option:
    /opt/mapr/server/configure.sh -R
NOTE: Starting in EEP 6.0.1, Tez should be configured by running the $MAPR_HOME/server/configure.sh script with the -R option.
-
Configure Tez shuffle on a secured cluster:
Refer to Tez Shuffle to configure SSL encryption on shuffle.
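
Because the same <version> value appears in the upload, verification, and hive-env.sh steps above, it can help to drive them from a single argument. The following is a minimal sketch under that assumption: the function names, the DRY_RUN switch, and the version argument are illustrative, and only the commands and paths are taken from the procedure. With DRY_RUN at its default of 1, nothing is executed; the commands are printed for review.

```shell
#!/usr/bin/env bash
# Sketch of the upload, verification, and hive-env.sh steps, driven by one
# version value. Function names and DRY_RUN are illustrative assumptions;
# the hadoop commands and paths are the ones listed in the procedure.

# Print the command when DRY_RUN=1 (the default); otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Create /apps/tez, upload the Tez libraries, set permissions, list the result.
upload_tez_libs() {
  tez_version="$1"
  run hadoop fs -mkdir /apps
  run hadoop fs -mkdir /apps/tez
  run hadoop fs -put "/opt/mapr/tez/tez-${tez_version}" /apps/tez
  run hadoop fs -chmod -R 755 /apps/tez
  run hadoop fs -ls "/apps/tez/tez-${tez_version}"
}

# Print the lines to add to hive-env.sh for the given Tez version.
tez_env_lines() {
  tez_home="/opt/mapr/tez/tez-$1"
  printf 'export TEZ_CONF_DIR=%s/conf\n' "$tez_home"
  printf 'export TEZ_JARS=%s/*:%s/lib/*\n' "$tez_home" "$tez_home"
  # Single quotes keep the variables unexpanded until hive-env.sh is sourced.
  printf '%s\n' 'export HADOOP_CLASSPATH=$TEZ_CONF_DIR:$TEZ_JARS:$HADOOP_CLASSPATH'
}
```

For example, upload_tez_libs 1.2.3 prints the five hadoop fs commands for a placeholder version 1.2.3 (not a real release number), and tez_env_lines 1.2.3 prints the three export lines so they can be reviewed before being added to hive-env.sh. Setting DRY_RUN=0 would run the hadoop commands instead of printing them.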