Preparing to Upgrade from Hive 2.x to 3.x
Upgrading from Hive 2.x to 3.x requires you to understand data migration, ACID table migration, permissions, folder structures, and artifact naming.
EEP 9.0.0 introduced Hive 3.1.3, while EEP 7.x and 8.x supported Hive 2.3. Any upgrade from EEP 7.x or 8.x to EEP 9.0.0 requires a thorough review of the considerations in this topic. For information about the Hive versions in different EEPs, see Component Versions for Released EEPs.
ACID Table Migration
Hive 3.x supports all Hive 2.x data as is, including data in tables, partitions, and UDFs, except for ACID (transactional) tables. ACID tables require some actions before you upgrade from Hive 2.x to 3.x.
Hive 3.x changed the on-disk layout of ACID tables. Any ACID table partition that has had an Update, Delete, or Merge statement executed against it since the last major compaction must undergo a major compaction before you upgrade to Hive 3.x.
After the major compaction starts, do not execute any more Update, Delete, or Merge statements against these tables. Not following this sequence can lead to data corruption. Tables and partitions that contain only the results of Insert statements are fully compatible and do not need to be compacted.
For details, see ACID Table Upgrade Routine.
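The major compaction described above is requested in HiveQL with ALTER TABLE ... COMPACT 'major'. The following is a minimal sketch that builds such statements for a table or a single partition. The helper name and the table and partition names are hypothetical, and the sketch assumes that partition values are simple strings that need no escaping.

```python
from typing import Mapping, Optional


def major_compaction_statement(
    table: str, partition: Optional[Mapping[str, str]] = None
) -> str:
    """Build the HiveQL that requests a major compaction of a table or partition."""
    statement = f"ALTER TABLE {table}"
    if partition:
        # Assumption: partition values are plain strings without quotes to escape.
        spec = ", ".join(f"{key}='{value}'" for key, value in partition.items())
        statement += f" PARTITION ({spec})"
    return statement + " COMPACT 'major';"


# Hypothetical table and partition names, for illustration only.
print(major_compaction_statement("sales.orders_acid", {"ds": "2023-01-01"}))
print(major_compaction_statement("sales.customers_acid"))
```

The generated statements would be run on the Hive 2.x cluster before the upgrade. SHOW COMPACTIONS can then be used to confirm that each request has finished before you proceed.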
Permission Processing for New Tables
Instead of the Hive permission inheritance that was based on the hive.warehouse.subdir.inherit.perms parameter setting, Hive 3.x supports the data-fabric file-system access control model. In Hive 3.x, a directory inherits permissions from the default file-system value. All permissions-inheritance logic has been removed. The resulting permissions are:
- 777 - default warehouse directory
- 755 - child directories (no more inheritance)
Table permissions that remain from Hive 2.x are unchanged.
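To see which directories carry the Hive 3.x defaults and which still carry Hive 2.x permissions, a small audit script can help. The following is a sketch only: it assumes that the warehouse directory is reachable through a POSIX mount, and the function names and the path in the usage comment are hypothetical.

```python
import os
import stat
from typing import List, Tuple


def permission_bits(path: str) -> str:
    """Return the permission bits of path as an octal string such as '755'."""
    return format(stat.S_IMODE(os.stat(path).st_mode), "03o")


def audit_warehouse(warehouse_dir: str) -> List[Tuple[str, str, str]]:
    """List (path, actual, expected) for directories that differ from the defaults."""
    mismatches = []
    root_bits = permission_bits(warehouse_dir)
    if root_bits != "777":
        mismatches.append((warehouse_dir, root_bits, "777"))
    # Only direct children are inspected; nested directories are not walked.
    with os.scandir(warehouse_dir) as entries:
        children = sorted(entries, key=lambda entry: entry.path)
    for entry in children:
        if entry.is_dir(follow_symlinks=False):
            bits = permission_bits(entry.path)
            if bits != "755":
                mismatches.append((entry.path, bits, "755"))
    return mismatches


# Usage (hypothetical mount path):
#   for path, actual, expected in audit_warehouse("/mapr/my.cluster/user/hive/warehouse"):
#       print(f"{path}: {actual} (new directories default to {expected})")
```

A mismatch is not necessarily an error, because tables created under Hive 2.x keep their existing permissions.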
Folder Structure and Versioning
Hive 3.x uses a different version format in the HIVE_HOME pattern. For example:
Hive Version | HIVE_HOME Pattern |
---|---|
2.x | /opt/mapr/hive/hive-2.3 |
3.x | /opt/mapr/hive/hive-3.1.3 |
This change can affect any custom utilities that parse HIVE_HOME.
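A parsing utility that accepts both HIVE_HOME layouts might look like the following sketch. The function name is hypothetical, and the regular expression assumes only the two path shapes shown in the table above.

```python
import re
from typing import Tuple

# Matches both /opt/mapr/hive/hive-2.3 and /opt/mapr/hive/hive-3.1.3.
_HIVE_HOME_RE = re.compile(r"^/opt/mapr/hive/hive-(\d+(?:\.\d+)*)/?$")


def hive_version_from_home(hive_home: str) -> Tuple[int, ...]:
    """Extract the version components from a HIVE_HOME path."""
    match = _HIVE_HOME_RE.match(hive_home)
    if match is None:
        raise ValueError(f"Unrecognized HIVE_HOME: {hive_home!r}")
    return tuple(int(part) for part in match.group(1).split("."))


print(hive_version_from_home("/opt/mapr/hive/hive-2.3"))    # (2, 3)
print(hive_version_from_home("/opt/mapr/hive/hive-3.1.3"))  # (3, 1, 3)
```

Returning a tuple of integers rather than a fixed number of fields means that the same code handles a two-part version and a three-part version.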
Artifact Naming
Hive 3.x artifacts follow the naming pattern hive-A.B.C.D.jar, where:
- A is the Major version
- B is the Minor version
- C is the Patch version
- D is the EBF/Release version
This change can affect dependency management in custom applications that refer to Hive 3.x.
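Custom tooling that inspects Hive JAR names can split the four fields as in the following sketch. The class and function names are hypothetical, the sample file name is an illustrative value rather than a real artifact, and the sketch assumes that all four fields are numeric.

```python
import re
from typing import NamedTuple


class HiveArtifactVersion(NamedTuple):
    major: int
    minor: int
    patch: int
    release: int  # EBF/Release version


# Assumption: each of the four fields A, B, C, and D is a plain integer.
_ARTIFACT_RE = re.compile(r"^hive-(\d+)\.(\d+)\.(\d+)\.(\d+)\.jar$")


def parse_artifact_name(file_name: str) -> HiveArtifactVersion:
    """Split a hive-A.B.C.D.jar file name into its four version fields."""
    match = _ARTIFACT_RE.match(file_name)
    if match is None:
        raise ValueError(f"Not a hive-A.B.C.D.jar name: {file_name!r}")
    return HiveArtifactVersion(*(int(group) for group in match.groups()))


# Illustrative file name only.
version = parse_artifact_name("hive-3.1.3.0.jar")
print(version.major, version.minor, version.patch, version.release)  # 3 1 3 0
```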