Configuring MSCK REPAIR TABLE
This section describes how to configure the MSCK REPAIR TABLE command, which compares the partitions recorded in the Hive metastore with the partitions on the file system and updates the metastore to match.
Use the MSCK REPAIR TABLE command to manually update (ADD, DROP, or SYNC) the partitions in the Hive metastore so that they match file systems such as HDFS, Amazon S3, and others.
For example, you specify the file system location when you create a Hive table. When you add partitions to, or delete partitions from, the file system directly, the partitions on the file system and in the Hive metastore become inconsistent.
Run the MSCK REPAIR TABLE command to compare the partitions on the file system with the partitions in the Hive metastore and update the partitions in the Hive metastore.
MSCK [REPAIR] TABLE <table name> [ADD/DROP/SYNC PARTITIONS];
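The following is a minimal sketch of how the three options might be used. The table name `sales`, its columns, the partition column `dt`, and the location `/data/sales` are hypothetical and chosen only for illustration; adjust them to your environment.

```sql
-- Hypothetical external table; the name, columns, and location are illustrative only.
CREATE EXTERNAL TABLE sales (id INT, amount DOUBLE)
PARTITIONED BY (dt STRING)
LOCATION '/data/sales';

-- Assume partition directories (for example /data/sales/dt=2024-01-01) were
-- created or removed directly on the file system, outside of Hive.

-- Report mismatches without changing the metastore (REPAIR omitted).
MSCK TABLE sales;

-- Add partitions that exist on the file system but not in the metastore.
MSCK REPAIR TABLE sales ADD PARTITIONS;

-- Drop metastore partitions whose directories no longer exist.
MSCK REPAIR TABLE sales DROP PARTITIONS;

-- Perform both the ADD and the DROP in a single pass.
MSCK REPAIR TABLE sales SYNC PARTITIONS;

-- Check the result.
SHOW PARTITIONS sales;
```

In practice you would typically run only one of the REPAIR variants, depending on whether partitions were added, removed, or both.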
Configure the Hive Metastore with the following Hive property:
Property | Default | Description |
---|---|---|
hive.msck.repair.batch.max.retries | 0 | Maximum number of retries for the MSCK REPAIR command when adding unknown partitions. If the value is greater than zero, the command retries adding unknown partitions until the maximum number of attempts is reached or the batch size is reduced to zero, whichever comes first. On each retry, the batch size is halved until it reaches zero. If the value is zero, the command has no retry limit and keeps retrying, halving the batch size each time, until the batch size reaches zero. |