ACID Table Upgrade Routine
Describes the procedure that you must follow if your Hive 2.x installation includes ACID tables and you want to upgrade from Hive 2.x to 3.x. If you upgrade Hive from 2.x to 3.x without performing these steps, data in the ACID tables will be corrupted during the upgrade.
Prerequisites
The following steps assume:
- A cluster with release 7.0.0 and EEP 8.1.0.
- The cluster is running Hive 2.3 and Hadoop 2.7.
- Derby is not used as the Hive Metastore backend database.
- The Hive Upgrade ACID Tool JAR has been downloaded to the Hive 2.x installation node.
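The Derby prerequisite can be checked by reading the metastore connection URL from the Hive configuration. The following is a minimal sketch, not part of the product. It assumes that the URL is set through the standard Hive property javax.jdo.option.ConnectionURL in a hive-site.xml file under the Hive conf directory. The function name metastore_backend is hypothetical. If the property is absent, the sketch reports "unset"; in that case Hive may fall back to its built-in default, so verify the backend manually.

```shell
# Sketch: classify the Hive Metastore backend from a hive-site.xml file.
# metastore_backend is a hypothetical helper name, not a product command.
# Prints "derby", "other", or "unset" (property not found in the file).
metastore_backend() {
  site_file=$1
  # Take the first <value> that follows the ConnectionURL <name> element.
  url=$(awk '
    /<name>javax\.jdo\.option\.ConnectionURL<\/name>/ { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$site_file")
  case "$url" in
    "")            echo "unset" ;;
    jdbc:derby:*)  echo "derby" ;;
    *)             echo "other" ;;
  esac
}

# Example call (the path is an assumption based on the Hive 2.3 layout):
#   metastore_backend /opt/mapr/hive/hive-2.3/conf/hive-site.xml
```

A result of "other" is consistent with the prerequisite. A result of "derby" means the steps in this topic do not apply as written.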
Considerations for Running the Tool
Note these considerations:
- You must run the Upgrade ACID Tool before upgrading any cluster package.
- You must run the Upgrade ACID Tool on a live cluster.
- Before running the Upgrade ACID Tool, stop the hs2 service to ensure no access is permitted during the upgrade tool run.
ACID Table Upgrade Steps
Use these steps to run the tool:
- Stop the hs2 service:
$ maprcli node services -action stop -nodes `hostname -f` -name hs2;
- Run the Upgrade ACID Tool. Modify the following paths in the run command to match the environment:
Path                                                          Description
/opt/mapr/hive/hive-<old_hive_version>                        Path to the Hive 2.x installation
/opt/mapr/hadoop/hadoop-<old_hadoop_version>                  Path to the Hadoop 2.7 installation
<path_to>/hive-upgrade-acid-<new_hive_version>-eep-900.jar    Path to the upgrade tool JAR file
Here is the command syntax:
$ java -cp /opt/mapr/lib/*:/opt/mapr/hive/hive-<old_hive_version>/lib/*:/opt/mapr/hive/hive-<old_hive_version>/conf/*:/opt/mapr/hadoop/hadoop-<old_hadoop_version>/lib/*:/opt/mapr/hadoop/hadoop-<old_hadoop_version>/etc/hadoop/*:/opt/mapr/hadoop/hadoop-<old_hadoop_version>/share/hadoop/yarn/sources/*:/opt/mapr/hadoop/hadoop-<old_hadoop_version>/share/hadoop/mapreduce/*:/opt/mapr/hadoop/hadoop-<old_hadoop_version>/share/hadoop/mapreduce/sources/*:/opt/mapr/hadoop/hadoop-<old_hadoop_version>/share/hadoop/hdfs/*:/opt/mapr/hadoop/hadoop-<old_hadoop_version>/share/hadoop/hdfs/sources/*:<path_to>/hive-upgrade-acid-<new_hive_version>-eep-900.jar org.apache.hadoop.hive.upgrade.acid.UpgradeTool -preUpgrade -execute
Here's an example. If the path values are as follows:
Path                                                Description
/opt/mapr/hive/hive-2.3                             Path to the Hive 2.x installation
/opt/mapr/hadoop/hadoop-2.7.6                       Path to the Hadoop 2.7 installation
/home/mapr/hive-upgrade-acid-3.1.3.0-eep-900.jar    Path to the upgrade tool JAR file
the run command is:
$ java -cp /opt/mapr/lib/*:/opt/mapr/hive/hive-2.3/lib/*:/opt/mapr/hive/hive-2.3/conf/*:/opt/mapr/hadoop/hadoop-2.7.6/lib/*:/opt/mapr/hadoop/hadoop-2.7.6/etc/hadoop/*:/opt/mapr/hadoop/hadoop-2.7.6/share/hadoop/yarn/sources/*:/opt/mapr/hadoop/hadoop-2.7.6/share/hadoop/mapreduce/*:/opt/mapr/hadoop/hadoop-2.7.6/share/hadoop/mapreduce/sources/*:/opt/mapr/hadoop/hadoop-2.7.6/share/hadoop/hdfs/*:/opt/mapr/hadoop/hadoop-2.7.6/share/hadoop/hdfs/sources/*:/home/mapr/hive-upgrade-acid-3.1.3.0-eep-900.jar org.apache.hadoop.hive.upgrade.acid.UpgradeTool -preUpgrade -execute
Note that the -preUpgrade and -execute flags are mandatory.
- Continue the cluster and Hive upgrade procedures. At this point, the ACID tables are ready for use by Hive 3.x, and no further ACID upgrade actions are required.
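Because the run command differs between environments only in three paths, the classpath can be assembled from variables instead of being edited by hand. The following is a minimal sketch, not part of the product. The function name build_acid_classpath and the variables HIVE_HOME_OLD, HADOOP_HOME_OLD, and TOOL_JAR are illustrative, and the default values come from the example in the steps. The script only prints the command for review; it does not run the tool.

```shell
# Sketch: build the Upgrade ACID Tool classpath from three paths.
# build_acid_classpath, HIVE_HOME_OLD, HADOOP_HOME_OLD, and TOOL_JAR are
# illustrative names, not product settings.
build_acid_classpath() {
  hive_home=$1
  hadoop_home=$2
  tool_jar=$3
  # printf repeats the format for every argument, which yields "a:b:c:".
  printf '%s:' \
    "/opt/mapr/lib/*" \
    "${hive_home}/lib/*" \
    "${hive_home}/conf/*" \
    "${hadoop_home}/lib/*" \
    "${hadoop_home}/etc/hadoop/*" \
    "${hadoop_home}/share/hadoop/yarn/sources/*" \
    "${hadoop_home}/share/hadoop/mapreduce/*" \
    "${hadoop_home}/share/hadoop/mapreduce/sources/*" \
    "${hadoop_home}/share/hadoop/hdfs/*" \
    "${hadoop_home}/share/hadoop/hdfs/sources/*"
  # The tool JAR is the last entry, so it gets no trailing colon.
  printf '%s\n' "${tool_jar}"
}

# Defaults mirror the example values; override them for your environment.
HIVE_HOME_OLD="${HIVE_HOME_OLD:-/opt/mapr/hive/hive-2.3}"
HADOOP_HOME_OLD="${HADOOP_HOME_OLD:-/opt/mapr/hadoop/hadoop-2.7.6}"
TOOL_JAR="${TOOL_JAR:-/home/mapr/hive-upgrade-acid-3.1.3.0-eep-900.jar}"

CP=$(build_acid_classpath "$HIVE_HOME_OLD" "$HADOOP_HOME_OLD" "$TOOL_JAR")

# Print the command for review instead of running it.
printf "java -cp '%s' org.apache.hadoop.hive.upgrade.acid.UpgradeTool -preUpgrade -execute\n" "$CP"
```

The printed command single-quotes the classpath so that the shell passes the * wildcards to Java unchanged. Java itself expands a directory/* classpath entry to the JAR files in that directory.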
Troubleshooting
This section addresses common troubleshooting scenarios during the ACID table upgrade operation:
- Problem
- The Hive Upgrade ACID Tool finishes almost instantly with the following log messages (the example log is trimmed for readability):
INFO [main] acid.UpgradeTool - No compaction is necessary
INFO [main] acid.UpgradeTool - No acid conversion is necessary
INFO [main] acid.UpgradeTool - No managed table conversion is necessary
INFO [main] acid.UpgradeTool - No file renaming is necessary
- Solution
- These log messages do not necessarily indicate a problem. Even when ACID tables are present, the upgrade tool can determine that the tables need no upgrade modifications.
- Problem
- The Hive Upgrade ACID Tool fails with the following error:
java.lang.NoClassDefFoundError
- Solution
- Make sure all paths in the run command are specified correctly and exist in the file system.
- Problem
- The Hive Upgrade ACID Tool fails with the following error:
Error: Could not find or load main class org.apache.hadoop.hive.upgrade.acid.UpgradeTool
- Solution
- Make sure that:
- The path to the upgrade tool JAR file is specified correctly.
- The JAR file is included in the classpath option.
- The JAR file exists within the specified path.
- Problem
- The Hive Upgrade ACID Tool fails with the following log messages (the example log is trimmed for readability):
ERROR [main] acid.UpgradeTool - UpgradeTool failed
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.io.AcidUtils.getChildState(AcidUtils.java)
at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java)
- Solution
- Most likely the run command does not contain the -execute flag. Make sure that the -execute flag contains a preceding dash.
- Problem
- The Hive Upgrade ACID Tool fails with the following log messages (the example log is trimmed for readability):
WARN rpcauth.RpcAuthRegistry - No RpcAuthMethod registerd for authentication method CUSTOM
ERROR acid.UpgradeTool - UpgradeTool failed
java.lang.NullPointerException
at org.apache.hadoop.hive.thrift.ThriftTransportHelper.createMapRSaslTransport(ThriftTransportHelper.java)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge25Sasl$Client.createClientTransport(HadoopThriftAuthBridge25Sasl.java)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java)
- Solution
- Most likely too many JARs were specified in the classpath. Do not use a command such as the following to collect JARs for the classpath in the upgrade utility run command:
find /opt/mapr -iname "*.jar" | xargs | tr -s ' ' ':'
Use exactly the classpath values specified in the preceding template.
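Two of the failures described in this section (java.lang.NoClassDefFoundError and "Could not find or load main class") are resolved by confirming that every classpath entry exists. The following is a minimal sketch of such a check, not part of the Upgrade ACID Tool. The function name check_classpath is hypothetical. The sketch tests only that each directory or file exists; it does not inspect JAR contents.

```shell
# Sketch: report classpath entries that do not exist in the file system.
# check_classpath is a hypothetical helper name, not a product command.
# The body runs in a subshell, so the IFS and globbing changes stay local.
check_classpath() (
  classpath=$1
  missing=0
  IFS=':'
  set -f   # keep "*" literal while splitting on ":"
  for entry in $classpath; do
    case "$entry" in
      */\*) path=${entry%/\*} ;;  # "dir/*" entry: test the directory
      *)    path=$entry ;;        # plain entry: test the file itself
    esac
    if [ ! -e "$path" ]; then
      printf 'missing: %s\n' "$entry"
      missing=$((missing + 1))
    fi
  done
  # Succeed only when every entry was found.
  [ "$missing" -eq 0 ]
)

# Example call with a shortened classpath (paths are from the example values):
#   check_classpath '/opt/mapr/lib/*:/opt/mapr/hive/hive-2.3/lib/*:/home/mapr/hive-upgrade-acid-3.1.3.0-eep-900.jar'
```

Pass the same classpath string that you give to java -cp, in single quotes. The function prints one "missing:" line per entry that it cannot find and returns a nonzero status if any entry is missing.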