Hive Connection Parameters

This topic lists the Hive connection parameters with their descriptions, default values, and supported data types.

If you want to connect HPE Ezmeral Unified Analytics Software to a Hive data source that uses Kerberos for authentication, see Configuring a Hive Data Source with Kerberos Authentication.

The following sections list the required and optional Hive connection parameters.
NOTE
Hive connector values vary based on the type of metastore. See https://prestodb.io/docs/current/connector/hive.html.

Required Connection Parameters

The following table lists the required connection parameters:
Parameter Description Default Value Data Type
Hive Metastore The type of Hive metastore to use. thrift STRING
Enable Local Snapshot Table Enable caching while querying. true BOOLEAN
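
As a rough illustration, the two required parameters could be represented and type-checked before they are submitted. This is only a sketch: the dictionary keys reuse the labels from the table above, while the `REQUIRED` mapping and the `validate_required` helper are hypothetical names invented for this example and are not part of any HPE Ezmeral or Presto API.

```python
# Hypothetical sketch: expected Python types for the required parameters,
# keyed by the labels used in the table above.
REQUIRED = {
    "Hive Metastore": str,                # STRING, default "thrift"
    "Enable Local Snapshot Table": bool,  # BOOLEAN, default true
}


def validate_required(settings):
    """Return a list of problems found in a settings dictionary."""
    problems = []
    for name, expected in REQUIRED.items():
        if name not in settings:
            problems.append(f"missing required parameter: {name}")
        elif not isinstance(settings[name], expected):
            problems.append(f"{name} must be of type {expected.__name__}")
    return problems


# The defaults from the table pass the check, so the list is empty.
settings = {"Hive Metastore": "thrift", "Enable Local Snapshot Table": True}
print(validate_required(settings))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a single pass.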

Optional Connection Parameters

The following table lists the optional connection parameters:
Parameter Description Default Value Data Type
Hive Insert Overwrite Immutable Partitions Enabled When enabled, an insert query overwrites existing partitions if partitions are immutable. This setting only takes effect when Hive Immutable Partitions is set to true. false BOOLEAN
Hive Create Empty Bucket Files For Temporary Table Create empty files when there is no data for temporary table buckets. false BOOLEAN
Hive Enable Parquet Batch Reader Verification Enable optimized parquet reader. false BOOLEAN
Hive Min Bucket Count To Not Ignore Table Bucketing Ignore table bucketing when the table bucket count is less than the specified value; otherwise, bucketing is controlled by the hive.ignore-table-bucketing property. 0 INTEGER
Hive Partition Statistics Based Optimization Enabled Enables partition statistics based optimization, including partition pruning and predicate stripping. false BOOLEAN
Hive Experimental Optimized Partition Update Serialization Enabled Serialize PartitionUpdate objects using binary SMILE encoding and compress them with ZSTD compression. false BOOLEAN
Hive Materialized View Missing Partitions Threshold Materialized views with more missing partitions than this threshold fall back to the base tables at read time. 100 INTEGER
Hive S3select Pushdown Max Connections The maximum number of client connections allowed for S3 Select pushdown operations from worker nodes. 500 INTEGER
Hive Temporary Staging Directory Enabled Use a temporary staging directory for write operations, if possible. true BOOLEAN
Hive Temporary Staging Directory Path Location of temporary staging directory for write operations. Use ${USER} placeholder to use different location for each user. /tmp/presto-${USER} STRING
Hive Temporary Table Storage Format The default file format used when creating temporary tables. ORC STRING
Hive Temporary Table Compression Codec The compression codec to use when writing files for temporary tables. SNAPPY STRING
Hive Use Pagefile For Hive Unsupported Type Automatically switch to PAGEFILE format for materialized exchange when encountering unsupported types. true BOOLEAN
Hive Parquet Pushdown Filter Enabled Enable complex filter pushdown for Parquet. false BOOLEAN
Hive Range Filters On Subscripts Enabled Enable pushdown of range filters on subscripts (a[2] = 5) into ORC column readers. false BOOLEAN
Hive Adaptive Filter Reordering Enabled Enable adaptive filter reordering. true BOOLEAN
Hive Parquet Batch Read Optimization Enabled Enable Parquet batch read optimization. false BOOLEAN
Hive Enable Parquet Dereference Pushdown Enable pushdown of dereference expressions into the Parquet reader. false BOOLEAN
Hive Max Metadata Updater Threads Maximum number of metadata updater threads. 100 INTEGER
Hive Partial_aggregation_pushdown_enabled Enable partial aggregation pushdown. false BOOLEAN
Hive Manifest Verification Enabled Enable verification of file names and sizes in manifest / partition parameters. false BOOLEAN
Hive Undo Metastore Operations Enabled Enable undo metastore operations. true BOOLEAN
Hive Verbose Runtime Stats Enabled Enable tracking all runtime stats. Note that this may affect query performance. false BOOLEAN
Hive Prefer Manifests To List Files Prefer to fetch the list of file names and sizes from manifests rather than from storage. false BOOLEAN
Hive Partition Lease Duration Partition lease duration. 0.00s DURATION
Hive Size Based Split Weights Enabled Enable estimating split weights based on size in bytes. true BOOLEAN
Hive Minimum Assigned Split Weight Minimum weight that a split can be assigned when size based split weights are enabled. 0.05 DOUBLE
Hive Use Record Page Source For Custom Split Use the record page source for custom splits. Used to query merge-on-read (MOR) tables in Hudi. true BOOLEAN
Hive Split Loader Concurrency Maximum number of concurrent threads per split source. 4 INTEGER
Hive Domain Compaction Threshold Maximum ranges to allow in a tuple domain without compacting it. 100 INTEGER
Hive Max Concurrent File Renames Maximum number of concurrent file renames. 20 INTEGER
Hive Max Concurrent Zero Row File Creations Maximum number of zero row file creations. 20 INTEGER
Hive Recursive Directories Enable reading data from subdirectories of table or partition locations. If disabled, subdirectories are ignored. false BOOLEAN
Hive User Defined Type Encoding Enabled Enable user-defined type encoding. false BOOLEAN
Hive Loose Memory Accounting Enabled When enabled, relaxes memory accounting so that queries that previously ran within memory thresholds can still run even if they now violate memory limits. false BOOLEAN
Hive Max Outstanding Splits Size Maximum amount of memory allowed for split buffering for each table scan in a query before the query fails. 256MB DATASIZE
Hive Max Split Iterator Threads Maximum number of iterator threads. 1000 INTEGER
Hive Allow Corrupt Writes For Testing Allow Hive connector to write data even when data will likely be corrupt. false BOOLEAN
Hive Create Empty Bucket Files Should empty files be created for buckets that have no data? true BOOLEAN
Hive Max Partitions Per Writers Maximum number of partitions per writer. 100 INTEGER
Hive Write Validation Threads Number of threads used for verifying data after a write. 16 INTEGER
Hive Orc Tiny Stripe Threshold ORC: Threshold below which an ORC stripe or file is read in its entirety. 8MB DATASIZE
Hive Orc Lazy Read Small Ranges ORC: Read small disk ranges lazily. true BOOLEAN
Hive Orc Bloom Filters Enabled ORC: Enable bloom filters for predicate pushdown. false BOOLEAN
Hive Orc Default Bloom Filter Fpp ORC: Bloom filter false positive probability. 0.05 DOUBLE
Hive Orc Optimized Writer Enabled Experimental: ORC: Enable optimized writer. true BOOLEAN
Hive Orc Writer Validation Percentage Percentage of ORC files to validate after write by re-reading the whole file. 0 DOUBLE
Hive Orc Writer Validation Mode Level of detail in ORC validation. Lower levels require more memory. BOTH STRING
Hive Rcfile Optimized Writer Enabled Experimental: RCFile: Enable optimized writer. true BOOLEAN
Hive Assume Canonical Partition Keys Assume canonical partition keys? false BOOLEAN
Hive Parquet Fail On Corrupted Statistics Fail when scanning Parquet files with corrupted statistics. true BOOLEAN
Hive Parquet Max Read Block Size Parquet: Maximum size of a block to read. 16MB DATASIZE
Hive Optimize Mismatched Bucket Count Enable optimization to avoid shuffle when bucket count is compatible but not the same. false BOOLEAN
Hive Zstd Jni Decompression Enabled Use JNI based zstd decompression for reading ORC files. false BOOLEAN
Hive File Status Cache Size Hive file status cache size. 0 LONG
Hive File Status Cache Expire Time Hive file status cache: expiry time. 0.00s DURATION
Hive Per Transaction Metastore Cache Maximum Size Maximum number of metastore data objects in the Hive metastore cache per transaction. 1000 INTEGER
Hive Metastore Refresh Interval Asynchronously refresh cached metastore data after access if it is older than this but is not yet expired, allowing subsequent accesses to see fresh data. 0.00s DURATION
Hive Metastore Cache Maximum Size Maximum number of metastore data objects in the Hive metastore cache. 10000 INTEGER
Hive Metastore Refresh Max Threads Maximum threads used to refresh cached metastore data. 100 INTEGER
Hive Partition Versioning Enabled false BOOLEAN
Hive Metastore Impersonation Enabled Should Presto user be impersonated when communicating with Hive Metastore. false BOOLEAN
Hive Partition Cache Validation Percentage Percentage of partition cache validation. 0 DOUBLE
Hive Metastore Thrift Client Socks Proxy Metastore thrift client socks proxy. null STRING
Hive Metastore Timeout Timeout for Hive metastore requests. 10.00s DURATION
Hive Dfs Verify Checksum Verify checksum for data consistency. true BOOLEAN
Hive Metastore Cache Ttl How long cached metastore data is considered valid. 0.00s DURATION
Hive Metastore Recording Path Metastore recording path. null STRING
Hive Replay Metastore Recording Replay metastore recording. false BOOLEAN
Hive Metastore Recoding Duration Metastore recording duration. 0.00m DURATION
Hive Dfs Require Hadoop Native Hadoop native is required? true BOOLEAN
Hive Metastore Cache Scope Metastore cache scope. ALL STRING
Hive Metastore Authentication Type Hive metastore authentication type. NONE STRING
Hive Hdfs Authentication Type HDFS authentication type. NONE STRING
Hive Hdfs Impersonation Enabled Should Presto user be impersonated when communicating with HDFS. false BOOLEAN
Hive Hdfs Wire Encryption Enabled Should be turned on when HDFS wire encryption is enabled. false BOOLEAN
Hive Skip Target Cleanup On Rollback Skip deletion of target directories when a metastore operation fails and the write mode is DIRECT_TO_TARGET_NEW_DIRECTORY. false BOOLEAN
Hive Bucket Execution Enable bucket-aware execution: only use a single worker per bucket. true BOOLEAN
Hive Bucket Function Type For Exchange Hash function type for exchange. HIVE_COMPATIBLE STRING
Hive Ignore Unreadable Partition Ignore unreadable partitions and report as warnings instead of failing the query. false BOOLEAN
Hive Max Buckets For Grouped Execution Maximum number of buckets to run with grouped execution. 1000000 INTEGER
Hive Sorted Write To Temp Path Enabled Enable writing temp files to temp path when writing to bucketed sorted tables. false BOOLEAN
Hive Sorted Write Temp Path Subdirectory Count Number of directories per partition for temp files generated by writing sorted table. 10 INTEGER
Hive Fs Cache Max Size Hadoop FileSystem cache size. 1000 INTEGER
Hive Non Managed Table Writes Enabled Enable writes to non-managed (external) tables. false BOOLEAN
Hive Non Managed Table Creates Enabled Enable non-managed (external) table creates. true BOOLEAN
Hive Table Statistics Enabled Enable use of table statistics. true BOOLEAN
Hive Partition Statistics Sample Size Specifies the number of partitions to analyze when computing table statistics. 100 INTEGER
Hive Ignore Corrupted Statistics Ignore corrupted statistics rather than failing. false BOOLEAN
Hive Collect Column Statistics On Write Enables automatic column level statistics collection on write. false BOOLEAN
Hive S3select Pushdown Enabled Enable query pushdown to AWS S3 Select service. false BOOLEAN
Hive Max Initial Splits Max initial splits. 200 INTEGER
Hive Max Initial Split Size Max initial split size. null DATASIZE
Hive Writer Sort Buffer Size Write sort buffer size. 64MB DATASIZE
Hive Node Selection Strategy Node affinity selection strategy. NO_PREFERENCE STRING
Hive Max Split Size Max split size. 64MB DATASIZE
Hive Max Partitions Per Scan Maximum allowed partitions for a single table scan. 100000 INTEGER
Hive Max Outstanding Splits Target number of buffered splits for each table scan in a query, before the scheduler tries to pause itself. 1000 INTEGER
Hive Metastore Partition Batch Size Min Hive metastore: minimum batch size for partitions. 10 INTEGER
Hive Metastore Partition Batch Size Max Hive metastore: maximum batch size for partitions. 100 INTEGER
Hive Config Resources An optional comma-separated list of HDFS configuration files. [] FILEPATH
Hive Dfs Ipc Ping Interval The client sends a ping when this interval passes without receiving bytes. 10.00s DURATION
Hive Dfs Timeout DFS timeout. 60.00s DURATION
Hive Dfs Connect Timeout DFS connection timeout. 500.00ms DURATION
Hive Dfs Connect Max Retries DFS: Maximum number of retries in case of connection issues. 5 INTEGER
Hive Storage Format The default file format used when creating new tables. ORC STRING
Hive Compression Codec The compression codec to use when writing files. GZIP STRING
Hive Orc Compression Codec The preferred compression codec to use when writing ORC and DWRF files. GZIP STRING
Hive Respect Table Format Should new partitions be written using the existing table format or the default PrestoDB format? true BOOLEAN
Hive Immutable Partitions Can new data be inserted into existing partitions? false BOOLEAN
Hive Max Open Sort Files Maximum number of writer temporary files to read in one pass. 50 INTEGER
Hive Dfs Domain Socket Path This is a path in the filesystem that allows the client and the DataNodes to communicate. null STRING
Hive S3 File System Type S3 file system type. PRESTO STRING
Hive Gcs Json Key File Path JSON key file used to access Google Cloud Storage. null FILEPATH
Hive Gcs Use Access Token Use client-provided OAuth token to access Google Cloud Storage. false BOOLEAN
Hive Orc Use Column Names Access ORC columns using names from the file. false BOOLEAN
Hive Orc Max Merge Distance ORC: Maximum size of the gap between two reads to merge into a single read. 1MB DATASIZE
Hive Orc Max Buffer Size ORC: Maximum size of a single read. 8MB DATASIZE
Hive Orc Stream Buffer Size ORC: Size of buffer for streaming reads. 8MB DATASIZE
Hive Orc Max Read Block Size ORC: Soft max size of Presto blocks produced by ORC reader. 16MB DATASIZE
Hive Rcfile Writer Validate Validate RCFile after write by re-reading the whole file. false BOOLEAN
Hive Text Max Line Length Maximum line length for text files. 100MB DATASIZE
Hive Parquet Use Column Names Access Parquet columns using names from the file. false BOOLEAN
Hive File Status Cache Tables The tables that have file status cache enabled. Setting to '*' includes all tables. STRING
Hive Skip Deletion For Alter Skip deletion of old partition data when a partition is deleted and then inserted in the same transaction. false BOOLEAN
Hive Sorted Writing Enable writing to bucketed sorted tables. true BOOLEAN
Hive Ignore Table Bucketing Ignore table bucketing to enable reading from unbucketed partitions. false BOOLEAN
Hive Temporary Table Schema Schema where to create temporary tables. default STRING
Hive Pushdown Filter Enabled Experimental: enable complex filter pushdown. false BOOLEAN
Hive Pagefile Writer Stripe Max Size PAGEFILE: Max stripe size. 24MB DATASIZE
Hive File_renaming_enabled Enable file renaming. false BOOLEAN
Hive partial_aggregation_pushdown_for_variable_length_datatypes_enabled Enable partial aggregation pushdown for variable length datatypes. false BOOLEAN
Hive Time Zone Sets the default time zone. null STRING
Hive Orc Writer Stripe Min Size ORC: Min stripe size. 32MB DATASIZE
Hive Orc Writer Stripe Max Size ORC: Max stripe size. 64MB DATASIZE
Hive Orc Writer Stripe Max Rows ORC: Max stripe row count. 10000000 INTEGER
Hive Orc Writer Row Group Max Rows ORC: Max rows in row group. 10000 INTEGER
Hive Orc Writer Dictionary Max Memory ORC: Max dictionary memory. 16MB DATASIZE
Hive Orc Writer String Statistics Limit ORC: Maximum size of string statistics; drop if exceeding. 64B DATASIZE
Hive Orc Writer Stream Layout Type ORC: Stream layout type. BY_COLUMN_SIZE STRING
Hive Orc Writer Dwrf Stripe Cache Mode Describes content of the DWRF stripe metadata cache. INDEX_AND_FOOTER STRING
Hive Orc Writer Max Compression Buffer Size ORC: Max compression buffer size. 256kB DATASIZE
Hive Orc Writer Dwrf Stripe Cache Enabled DWRF stripe cache enabled? false BOOLEAN
Hive Orc Writer Dwrf Stripe Cache Max Size DWRF stripe cache max size. 8MB DATASIZE
Hive Parquet Optimized Writer Enabled Parquet: Optimized writer enabled? false BOOLEAN
Hive Parquet Writer Block Size Parquet: Writer block size. 134217728B DATASIZE
Hive Parquet Writer Page Size Parquet: Writer page size. 1048576B DATASIZE
Hive Security The type of access control to use. legacy STRING
Generic Cache Table Ttl TTL for cache table expiry in minutes. 1440 INTEGER
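
The DATASIZE and DURATION defaults in the table above are written as a number followed by a unit (for example 256MB, 256kB, 10.00s, 500.00ms). The sketch below shows one way such strings might be converted to bytes and seconds for comparison or validation. It assumes binary multiples (1 kB = 1024 B) and covers only the units that appear in the table; the function names are hypothetical and not part of any product API. The last helper encodes the dependency stated in the table: the insert-overwrite option only takes effect when Hive Immutable Partitions is true.

```python
import re

# Assumption: binary multiples. This matches the Parquet writer block size
# default 134217728B, which equals 128 * 1024 * 1024.
DATA_UNITS = {"B": 1, "kB": 1024, "MB": 1024 ** 2}
DURATION_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0}

_VALUE = re.compile(r"^(\d+(?:\.\d+)?)\s*([A-Za-z]+)$")


def _split(text):
    """Split a string such as '10.00s' into (10.0, 's')."""
    match = _VALUE.match(text.strip())
    if match is None:
        raise ValueError(f"not a number followed by a unit: {text!r}")
    return float(match.group(1)), match.group(2)


def parse_data_size(text):
    """Convert a DATASIZE string such as '256MB' to a number of bytes."""
    number, unit = _split(text)
    if unit not in DATA_UNITS:
        raise ValueError(f"unknown data size unit: {unit!r}")
    return int(number * DATA_UNITS[unit])


def parse_duration(text):
    """Convert a DURATION string such as '500.00ms' to seconds."""
    number, unit = _split(text)
    if unit not in DURATION_UNITS:
        raise ValueError(f"unknown duration unit: {unit!r}")
    return number * DURATION_UNITS[unit]


def overwrite_takes_effect(settings):
    """True only if both the overwrite option and immutable partitions are on."""
    return bool(
        settings.get("Hive Insert Overwrite Immutable Partitions Enabled", False)
        and settings.get("Hive Immutable Partitions", False)
    )


print(parse_data_size("256MB"))   # Hive Max Outstanding Splits Size default
print(parse_duration("10.00s"))   # Hive Metastore Timeout default
```

With the defaults listed in the table (both partition options false), `overwrite_takes_effect({})` returns `False`.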