Hive Connection Parameters
List of Hive connection parameters, descriptions, default values, and supported data types.
If you want to connect HPE Ezmeral Unified Analytics Software to a Hive data source that uses Kerberos for authentication, see Configuring a Hive Data Source with Kerberos Authentication.
The following sections list the required and optional Hive connection parameters.
NOTE
Hive connector values vary based on the type of metastore. See https://prestodb.io/docs/current/connector/hive.html.
Required Connection Parameters
The following table lists the required connection parameters:
Parameter | Description | Default Value | Data Type |
---|---|---|---|
Hive Metastore | The type of Hive metastore to use. | thrift | STRING |
Enable Local Snapshot Table | Enable caching while querying. | true | BOOLEAN |
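The following Python sketch illustrates the defaults and data types of the two required parameters. The dictionary layout and the `resolve_required` helper are hypothetical and exist only for illustration; they are not part of the product or of any Presto API.

```python
# Hypothetical model of the two required parameters listed in the table above.
# The structure and names here are assumptions for illustration only.
REQUIRED_PARAMETERS = {
    "Hive Metastore": {"default": "thrift", "type": str},
    "Enable Local Snapshot Table": {"default": True, "type": bool},
}


def resolve_required(overrides=None):
    """Return the required parameter values, applying overrides after a type check."""
    overrides = overrides or {}
    unknown = set(overrides) - set(REQUIRED_PARAMETERS)
    if unknown:
        raise KeyError(f"Not a required parameter: {sorted(unknown)}")
    resolved = {}
    for name, spec in REQUIRED_PARAMETERS.items():
        value = overrides.get(name, spec["default"])
        # STRING maps to str and BOOLEAN maps to bool in this sketch.
        if not isinstance(value, spec["type"]):
            raise TypeError(f"{name} expects {spec['type'].__name__}")
        resolved[name] = value
    return resolved


if __name__ == "__main__":
    # With no overrides, the documented defaults are returned.
    print(resolve_required())
```

Calling `resolve_required({"Enable Local Snapshot Table": False})` keeps the default metastore type and turns caching off, while passing a non-boolean value for that parameter raises a `TypeError`.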
Optional Connection Parameters
The following table lists the optional connection parameters:
Parameter | Description | Default Value | Data Type |
---|---|---|---|
Hive Insert Overwrite Immutable Partitions Enabled | When enabled, insert queries overwrite existing partitions when partitions are immutable. This setting only takes effect when Hive Immutable Partitions is set to true. | false | BOOLEAN |
Hive Create Empty Bucket Files For Temporary Table | Create empty files when there is no data for temporary table buckets. | false | BOOLEAN |
Hive Enable Parquet Batch Reader Verification | Enable optimized Parquet reader. | false | BOOLEAN |
Hive Min Bucket Count To Not Ignore Table Bucketing | Ignore table bucketing when the table bucket count is less than the specified value; otherwise, bucketing is controlled by the hive.ignore-table-bucketing property. | 0 | INTEGER |
Hive Partition Statistics Based Optimization Enabled | Enables partition statistics based optimization, including partition pruning and predicate stripping. | false | BOOLEAN |
Hive Experimental Optimized Partition Update Serialization Enabled | Serialize PartitionUpdate objects using binary SMILE encoding and compress with the ZSTD compression. | false | BOOLEAN |
Hive Materialized View Missing Partitions Threshold | Materialized views with more missing partitions than this threshold fall back to the base tables at read time. | 100 | INTEGER |
Hive S3select Pushdown Max Connections | The maximum number of client connections allowed for S3 Select pushdown operations from worker nodes. | 500 | INTEGER |
Hive Temporary Staging Directory Enabled | Use a temporary staging directory for write operations when possible. | true | BOOLEAN |
Hive Temporary Staging Directory Path | Location of the temporary staging directory for write operations. Use the ${USER} placeholder to set a different location for each user. | /tmp/presto-${USER} | STRING |
Hive Temporary Table Storage Format | The default file format used when creating temporary tables. | ORC | STRING |
Hive Temporary Table Compression Codec | The compression codec to use when writing files for temporary tables. | SNAPPY | STRING |
Hive Use Pagefile For Hive Unsupported Type | Automatically switch to PAGEFILE format for materialized exchange when encountering unsupported types. | true | BOOLEAN |
Hive Parquet Pushdown Filter Enabled | Enable complex filter pushdown for Parquet. | false | BOOLEAN |
Hive Range Filters On Subscripts Enabled | Enable pushdown of range filters on subscripts (a[2] = 5) into ORC column readers. | false | BOOLEAN |
Hive Adaptive Filter Reordering Enabled | Enable adaptive filter reordering. | true | BOOLEAN |
Hive Parquet Batch Read Optimization Enabled | Enable Parquet batch read optimization. | false | BOOLEAN |
Hive Enable Parquet Dereference Pushdown | Enable pushdown of dereference expressions into the Parquet reader. | false | BOOLEAN |
Hive Max Metadata Updater Threads | Maximum number of metadata updater threads. | 100 | INTEGER |
Hive Partial Aggregation Pushdown Enabled | Enable partial aggregation pushdown. | false | BOOLEAN |
Hive Manifest Verification Enabled | Enable verification of file names and sizes in manifest / partition parameters. | false | BOOLEAN |
Hive Undo Metastore Operations Enabled | Enable undo metastore operations. | true | BOOLEAN |
Hive Verbose Runtime Stats Enabled | Enable tracking all runtime stats. Note that this may affect query performance. | false | BOOLEAN |
Hive Prefer Manifests To List Files | Prefer to fetch the list of file names and sizes from manifests rather than from storage. | false | BOOLEAN |
Hive Partition Lease Duration | Partition lease duration. | 0.00s | DURATION |
Hive Size Based Split Weights Enabled | Enable estimating split weights based on size in bytes. | true | BOOLEAN |
Hive Minimum Assigned Split Weight | Minimum weight that a split can be assigned when size based split weights are enabled. | 0.05 | DOUBLE |
Hive Use Record Page Source For Custom Split | Use record page source for custom split. Used to query MOR tables in Hudi. | true | BOOLEAN |
Hive Split Loader Concurrency | Number of maximum concurrent threads per split source. | 4 | INTEGER |
Hive Domain Compaction Threshold | Maximum ranges to allow in a tuple domain without compacting it. | 100 | INTEGER |
Hive Max Concurrent File Renames | Maximum number of concurrent file renames. | 20 | INTEGER |
Hive Max Concurrent Zero Row File Creations | Maximum number of concurrent zero-row file creations. | 20 | INTEGER |
Hive Recursive Directories | Enable reading data from subdirectories of table or partition locations. If disabled, subdirectories are ignored. | false | BOOLEAN |
Hive User Defined Type Encoding Enabled | Enable user-defined type encoding. | false | BOOLEAN |
Hive Loose Memory Accounting Enabled | When enabled, relaxes memory accounting so that queries that previously ran within memory thresholds, but now violate memory limits, can still run. | false | BOOLEAN |
Hive Max Outstanding Splits Size | Maximum amount of memory allowed for split buffering for each table scan in a query before the query fails. | 256MB | DATASIZE |
Hive Max Split Iterator Threads | Maximum number of iterator threads. | 1000 | INTEGER |
Hive Allow Corrupt Writes For Testing | Allow Hive connector to write data even when data will likely be corrupt. | false | BOOLEAN |
Hive Create Empty Bucket Files | Should empty files be created for buckets that have no data? | true | BOOLEAN |
Hive Max Partitions Per Writers | Maximum number of partitions per writer. | 100 | INTEGER |
Hive Write Validation Threads | Number of threads used for verifying data after a write. | 16 | INTEGER |
Hive Orc Tiny Stripe Threshold | ORC: Threshold below which an ORC stripe or file is read in its entirety. | 8MB | DATASIZE |
Hive Orc Lazy Read Small Ranges | ORC: Read small disk ranges lazily. | true | BOOLEAN |
Hive Orc Bloom Filters Enabled | ORC: Enable bloom filters for predicate pushdown. | false | BOOLEAN |
Hive Orc Default Bloom Filter Fpp | ORC Bloom filter false positive probability. | 0.05 | DOUBLE |
Hive Orc Optimized Writer Enabled | Experimental: ORC: Enable optimized writer. | true | BOOLEAN |
Hive Orc Writer Validation Percentage | Percentage of ORC files to validate after write by re-reading the whole file. | 0 | DOUBLE |
Hive Orc Writer Validation Mode | Level of detail in ORC validation. Lower levels require more memory. | BOTH | STRING |
Hive Rcfile Optimized Writer Enabled | Experimental: RCFile: Enable optimized writer. | true | BOOLEAN |
Hive Assume Canonical Partition Keys | Assume canonical partition keys? | false | BOOLEAN |
Hive Parquet Fail On Corrupted Statistics | Fail when scanning Parquet files with corrupted statistics. | true | BOOLEAN |
Hive Parquet Max Read Block Size | Parquet: Maximum size of a block to read. | 16MB | DATASIZE |
Hive Optimize Mismatched Bucket Count | Enable optimization to avoid shuffle when bucket count is compatible but not the same. | false | BOOLEAN |
Hive Zstd Jni Decompression Enabled | Use JNI based zstd decompression for reading ORC files. | false | BOOLEAN |
Hive File Status Cache Size | Hive file status cache size. | 0 | LONG |
Hive File Status Cache Expire Time | Hive file status cache: expiry time. | 0.00s | DURATION |
Hive Per Transaction Metastore Cache Maximum Size | Maximum number of metastore data objects in the Hive metastore cache per transaction. | 1000 | INTEGER |
Hive Metastore Refresh Interval | Asynchronously refresh cached metastore data after access if it is older than this but is not yet expired, allowing subsequent accesses to see fresh data. | 0.00s | DURATION |
Hive Metastore Cache Maximum Size | Maximum number of metastore data objects in the Hive metastore cache. | 10000 | INTEGER |
Hive Metastore Refresh Max Threads | Maximum threads used to refresh cached metastore data. | 100 | INTEGER |
Hive Partition Versioning Enabled | Enable partition versioning. | false | BOOLEAN |
Hive Metastore Impersonation Enabled | Should Presto user be impersonated when communicating with Hive Metastore. | false | BOOLEAN |
Hive Partition Cache Validation Percentage | Percentage of partition cache validation. | 0 | DOUBLE |
Hive Metastore Thrift Client Socks Proxy | Metastore thrift client socks proxy. | null | STRING |
Hive Metastore Timeout | Timeout for Hive metastore requests. | 10.00s | DURATION |
Hive Dfs Verify Checksum | Verify checksum for data consistency. | true | BOOLEAN |
Hive Metastore Cache Ttl | How long cached metastore data is considered valid. | 0.00s | DURATION |
Hive Metastore Recording Path | Metastore recording path. | null | STRING |
Hive Replay Metastore Recording | Replay metastore recording. | false | BOOLEAN |
Hive Metastore Recording Duration | Metastore recording duration. | 0.00m | DURATION |
Hive Dfs Require Hadoop Native | Is the Hadoop native library required? | true | BOOLEAN |
Hive Metastore Cache Scope | Metastore cache scope. | ALL | STRING |
Hive Metastore Authentication Type | Hive metastore authentication type. | NONE | STRING |
Hive Hdfs Authentication Type | HDFS authentication type. | NONE | STRING |
Hive Hdfs Impersonation Enabled | Should Presto user be impersonated when communicating with HDFS. | false | BOOLEAN |
Hive Hdfs Wire Encryption Enabled | Should be turned on when HDFS wire encryption is enabled. | false | BOOLEAN |
Hive Skip Target Cleanup On Rollback | Skip deletion of target directories when a metastore operation fails and the write mode is DIRECT_TO_TARGET_NEW_DIRECTORY. | false | BOOLEAN |
Hive Bucket Execution | Enable bucket-aware execution: only use a single worker per bucket. | true | BOOLEAN |
Hive Bucket Function Type For Exchange | Hash function type for exchange. | HIVE_COMPATIBLE | STRING |
Hive Ignore Unreadable Partition | Ignore unreadable partitions and report as warnings instead of failing the query. | false | BOOLEAN |
Hive Max Buckets For Grouped Execution | Maximum number of buckets to run with grouped execution. | 1000000 | INTEGER |
Hive Sorted Write To Temp Path Enabled | Enable writing temp files to temp path when writing to bucketed sorted tables. | false | BOOLEAN |
Hive Sorted Write Temp Path Subdirectory Count | Number of directories per partition for temp files generated by writing sorted table. | 10 | INTEGER |
Hive Fs Cache Max Size | Hadoop FileSystem cache size. | 1000 | INTEGER |
Hive Non Managed Table Writes Enabled | Enable writes to non-managed (external) tables. | false | BOOLEAN |
Hive Non Managed Table Creates Enabled | Enable non-managed (external) table creates. | true | BOOLEAN |
Hive Table Statistics Enabled | Enable use of table statistics. | true | BOOLEAN |
Hive Partition Statistics Sample Size | Specifies the number of partitions to analyze when computing table statistics. | 100 | INTEGER |
Hive Ignore Corrupted Statistics | Ignore corrupted statistics rather than failing. | false | BOOLEAN |
Hive Collect Column Statistics On Write | Enables automatic column level statistics collection on write. | false | BOOLEAN |
Hive S3select Pushdown Enabled | Enable query pushdown to AWS S3 Select service. | false | BOOLEAN |
Hive Max Initial Splits | Max initial splits. | 200 | INTEGER |
Hive Max Initial Split Size | Max initial split size. | null | DATASIZE |
Hive Writer Sort Buffer Size | Write sort buffer size. | 64MB | DATASIZE |
Hive Node Selection Strategy | Node affinity selection strategy. | NO_PREFERENCE | STRING |
Hive Max Split Size | Max split size. | 64MB | DATASIZE |
Hive Max Partitions Per Scan | Maximum allowed partitions for a single table scan. | 100000 | INTEGER |
Hive Max Outstanding Splits | Target number of buffered splits for each table scan in a query, before the scheduler tries to pause itself. | 1000 | INTEGER |
Hive Metastore Partition Batch Size Min | Hive metastore: minimum batch size for partitions. | 10 | INTEGER |
Hive Metastore Partition Batch Size Max | Hive metastore: maximum batch size for partitions. | 100 | INTEGER |
Hive Config Resources | An optional comma-separated list of HDFS configuration files. | [] | FILEPATH |
Hive Dfs Ipc Ping Interval | The client sends a ping when the interval passes without receiving bytes. | 10.00s | DURATION |
Hive Dfs Timeout | DFS timeout. | 60.00s | DURATION |
Hive Dfs Connect Timeout | DFS connection timeout. | 500.00ms | DURATION |
Hive Dfs Connect Max Retries | DFS: Maximum number of retries in case of connection issues. | 5 | INTEGER |
Hive Storage Format | The default file format used when creating new tables. | ORC | STRING |
Hive Compression Codec | The compression codec to use when writing files. | GZIP | STRING |
Hive Orc Compression Codec | The preferred compression codec to use when writing ORC and DWRF files. | GZIP | STRING |
Hive Respect Table Format | Should new partitions be written using the existing table format or the default PrestoDB format? | true | BOOLEAN |
Hive Immutable Partitions | Can new data be inserted into existing partitions? | false | BOOLEAN |
Hive Max Open Sort Files | Maximum number of writer temporary files to read in one pass. | 50 | INTEGER |
Hive Dfs Domain Socket Path | This is a path in the filesystem that allows the client and the DataNodes to communicate. | null | STRING |
Hive S3 File System Type | S3 file system type. | PRESTO | STRING |
Hive Gcs Json Key File Path | JSON key file used to access Google Cloud Storage. | null | FILEPATH |
Hive Gcs Use Access Token | Use client-provided OAuth token to access Google Cloud Storage. | false | BOOLEAN |
Hive Orc Use Column Names | Access ORC columns using names from the file. | false | BOOLEAN |
Hive Orc Max Merge Distance | ORC: Maximum size of the gap between two reads to merge into a single read. | 1MB | DATASIZE |
Hive Orc Max Buffer Size | ORC: Maximum size of a single read. | 8MB | DATASIZE |
Hive Orc Stream Buffer Size | ORC: Size of buffer for streaming reads. | 8MB | DATASIZE |
Hive Orc Max Read Block Size | ORC: Soft max size of Presto blocks produced by ORC reader. | 16MB | DATASIZE |
Hive Rcfile Writer Validate | Validate RCFile after write by re-reading the whole file. | false | BOOLEAN |
Hive Text Max Line Length | Maximum line length for text files. | 100MB | DATASIZE |
Hive Parquet Use Column Names | Access Parquet columns using names from the file. | false | BOOLEAN |
Hive File Status Cache Tables | The tables that have file status cache enabled. Setting to '*' includes all tables. | | STRING |
Hive Skip Deletion For Alter | Skip deletion of old partition data when a partition is deleted and then inserted in the same transaction. | false | BOOLEAN |
Hive Sorted Writing | Enable writing to bucketed sorted tables. | true | BOOLEAN |
Hive Ignore Table Bucketing | Ignore table bucketing to enable reading from unbucketed partitions. | false | BOOLEAN |
Hive Temporary Table Schema | Schema in which to create temporary tables. | default | STRING |
Hive Pushdown Filter Enabled | Experimental: enable complex filter pushdown. | false | BOOLEAN |
Hive Pagefile Writer Stripe Max Size | PAGEFILE: Max stripe size. | 24MB | DATASIZE |
Hive File Renaming Enabled | Enable file renaming. | false | BOOLEAN |
Hive Partial Aggregation Pushdown For Variable Length Datatypes Enabled | Enable partial aggregation pushdown for variable length datatypes. | false | BOOLEAN |
Hive Time Zone | Sets the default time zone. | null | STRING |
Hive Orc Writer Stripe Min Size | ORC: Min stripe size. | 32MB | DATASIZE |
Hive Orc Writer Stripe Max Size | ORC: Max stripe size. | 64MB | DATASIZE |
Hive Orc Writer Stripe Max Rows | ORC: Max stripe row count. | 10000000 | INTEGER |
Hive Orc Writer Row Group Max Rows | ORC: Max rows in row group. | 10000 | INTEGER |
Hive Orc Writer Dictionary Max Memory | ORC: Max dictionary memory. | 16MB | DATASIZE |
Hive Orc Writer String Statistics Limit | ORC: Maximum size of string statistics; drop if exceeding. | 64B | DATASIZE |
Hive Orc Writer Stream Layout Type | ORC: Stream layout type. | BY_COLUMN_SIZE | STRING |
Hive Orc Writer Dwrf Stripe Cache Mode | Describes content of the DWRF stripe metadata cache. | INDEX_AND_FOOTER | STRING |
Hive Orc Writer Max Compression Buffer Size | ORC: Max compression buffer size. | 256kB | DATASIZE |
Hive Orc Writer Dwrf Stripe Cache Enabled | DWRF stripe cache enabled? | false | BOOLEAN |
Hive Orc Writer Dwrf Stripe Cache Max Size | DWRF stripe cache max size. | 8MB | DATASIZE |
Hive Parquet Optimized Writer Enabled | Parquet: Optimized writer enabled? | false | BOOLEAN |
Hive Parquet Writer Block Size | Parquet: Writer block size. | 134217728B | DATASIZE |
Hive Parquet Writer Page Size | Parquet: Writer page size. | 1048576B | DATASIZE |
Hive Security | The type of access control to use. | legacy | STRING |
Generic Cache Table Ttl | TTL for cache table expiry in minutes. | 1440 | INTEGER |
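
The DURATION and DATASIZE defaults in the table combine a number with a unit suffix, such as 500.00ms or 256MB. The following Python sketch shows one way to read such values. It assumes binary multiples for data sizes (1kB = 1024B) and the unit suffixes ns, us, ms, s, m, h, and d for durations. The function names are hypothetical and are not part of the product.

```python
import re

# Assumption: data size units are binary multiples (1kB = 1024B).
DATA_SIZE_UNITS = {
    "B": 1,
    "kB": 1024,
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
    "PB": 1024 ** 5,
}

# Assumption: duration units and their length in seconds.
DURATION_UNITS = {
    "ns": 1e-9,
    "us": 1e-6,
    "ms": 1e-3,
    "s": 1.0,
    "m": 60.0,
    "h": 3600.0,
    "d": 86400.0,
}

# A number (optionally with decimals) followed by a unit suffix.
_VALUE_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]+)\s*$")


def _split_value(text, units):
    """Split a value such as '64MB' into its number and a known unit."""
    match = _VALUE_PATTERN.match(text)
    if match is None or match.group(2) not in units:
        raise ValueError(f"Cannot parse value: {text!r}")
    return float(match.group(1)), match.group(2)


def parse_data_size(text):
    """Convert a DATASIZE string such as '256MB' to a number of bytes."""
    number, unit = _split_value(text, DATA_SIZE_UNITS)
    return int(number * DATA_SIZE_UNITS[unit])


def parse_duration(text):
    """Convert a DURATION string such as '10.00s' to seconds."""
    number, unit = _split_value(text, DURATION_UNITS)
    return number * DURATION_UNITS[unit]


if __name__ == "__main__":
    print(parse_data_size("256MB"))    # Hive Max Outstanding Splits Size default
    print(parse_duration("500.00ms"))  # Hive Dfs Connect Timeout default
```

Under the binary-unit assumption, the Hive Parquet Writer Block Size default of 134217728B equals 128MB.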