Configuring Hive and Tez
About this task
To configure Hive on Tez, perform the following steps on each node where you want Hive to run on Tez. Tez mode is not compatible with all MapReduce (MR) jobs, so do not set up the whole cluster to work on Tez.
There is a known issue related to the incomplete removal of previously installed Tez packages. The issue affects platforms on which Tez was installed but later removed using sudo apt-get remove mapr-tez. Because of Ubuntu-specific behavior and Tez source-code issues, the remove command removes Tez only partially in some installations. If this happens, an error is generated when you try to reinstall Tez on Ubuntu, as described in step 1. If you believe your installation might have this issue, you can prevent the error: before performing the following steps, use the purge command to completely remove all previously installed Tez packages.
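As a rough illustration of the cleanup described above, the sketch below wraps the purge in a small helper that prints the command by default and runs it only when asked. It assumes an Ubuntu node and the mapr-tez package name used on this page; the function name purge_tez and the DRY_RUN variable are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Sketch only: purge_tez and DRY_RUN are illustrative names, not product tooling.
# Assumes an Ubuntu node where "apt-get remove mapr-tez" may have left files behind.
purge_tez() {
    cmd="sudo apt-get purge mapr-tez"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        $cmd            # really purge the package and its leftover files
    else
        echo "$cmd"     # default: only show what would be run
    fi
}
```

Calling purge_tez with DRY_RUN unset only prints the command; DRY_RUN=0 purge_tez would execute it.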
Procedure
-
Install Tez if it is not already installed. To install Tez, run the command for your platform:
On CentOS / RedHat: yum install mapr-tez
On SLES: zypper install mapr-tez
On Ubuntu: apt-get install mapr-tez
NOTE: Repeat this step on each node where you want Hive on Tez to be configured.
-
Create the /apps/tez directory on the Data Fabric file system. To create it, run the following commands:
hadoop fs -mkdir /apps
hadoop fs -mkdir /apps/tez
-
Upload the Tez libraries to the /apps/tez directory on the Data Fabric file system. To upload them, run the following commands:
hadoop fs -put /opt/mapr/tez/tez-<version> /apps/tez
hadoop fs -chmod -R 755 /apps/tez
-
Verify the upload.
To verify, run the following command:
hadoop fs -ls /apps/tez/tez-<version>
-
Set the Tez environment variables. Open the /opt/mapr/hive/hive-<version>/conf/hive-env.sh file, add the following lines, and save the file:
export TEZ_CONF_DIR=/opt/mapr/tez/tez-<version>/conf
export TEZ_JARS=/opt/mapr/tez/tez-<version>/*:/opt/mapr/tez/tez-<version>/lib/*
export HADOOP_CLASSPATH=$TEZ_CONF_DIR:$TEZ_JARS:$HADOOP_CLASSPATH
NOTE: Repeat this step on each node where you want Hive on Tez to be configured.
-
Configure Hive to use the Tez engine. Open the /opt/mapr/hive/hive-<version>/conf/hive-site.xml file, add the following lines, and save the file.
<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>
Add the hive.exec.pre.hooks, hive.exec.post.hooks, and hive.exec.failure.hooks properties with the value org.apache.hadoop.hive.ql.hooks.ATSHook to use the Hive queries page in the Tez UI.
NOTE: Starting from EEP 7.1.0, the following execution-hooks properties are managed by running the configure.sh command with the -R option.
<property>
  <name>hive.exec.pre.hooks</name>
  <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
</property>
<property>
  <name>hive.exec.post.hooks</name>
  <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
</property>
<property>
  <name>hive.exec.failure.hooks</name>
  <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
</property>
NOTE: Repeat this step on each node where you want Hive on Tez to be configured.
-
Run configure.sh with the -R option:
/opt/mapr/server/configure.sh -R
NOTE: In EEP 6.0.1 and later, Tez should be configured by running the $MAPR_HOME/server/configure.sh script with the -R option.
-
Configure Tez shuffle on a secured cluster:
Refer to Tez Shuffle to configure SSL encryption on shuffle.
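
The file system steps of the procedure (create /apps/tez, upload the libraries, verify) could be gathered into one helper, sketched below under a few assumptions. The names run, upload_tez, and DRY_RUN are hypothetical and not part of any product tooling, and the Tez version must be supplied by the caller because this page leaves it as <version>. The sketch also swaps the two separate mkdir calls for a single hadoop fs -mkdir -p, so that a rerun does not fail when /apps already exists.

```shell
#!/bin/sh
# Sketch only: run, upload_tez, and DRY_RUN are illustrative names.

run() {
    # Always print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

upload_tez() {
    ver="$1"
    if [ -z "$ver" ]; then
        echo "usage: upload_tez <tez-version>" >&2
        return 1
    fi
    # -mkdir -p replaces the two separate mkdir commands (a deliberate swap).
    run hadoop fs -mkdir -p /apps/tez &&
    run hadoop fs -put "/opt/mapr/tez/tez-$ver" /apps/tez &&
    run hadoop fs -chmod -R 755 /apps/tez &&
    run hadoop fs -ls "/apps/tez/tez-$ver"
}
```

With DRY_RUN unset, upload_tez only lists the commands it would run, which makes it easy to review them first; setting DRY_RUN=0 executes them and stops at the first failure.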