Delta Connection Parameters
List of Delta connection parameters, descriptions, default values, and supported data types.
The following sections list the required and optional Delta connection parameters.
Required Connection Parameters
The following table lists the required connection parameters:
NOTE
Delta connector values vary based on the type of metastore. See https://prestodb.io/docs/current/connector/deltalake.html.
Parameter | Description | Default Value | Data Type |
---|---|---|---|
Hive Metastore | The type of Hive metastore to use | thrift | STRING |
Enable Local Snapshot Table | Enable caching while querying | true | BOOLEAN |
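The required parameters correspond roughly to catalog properties of the upstream Presto Delta Lake connector linked in the note above. The following is a minimal, illustrative sketch of a delta catalog properties file, assuming a Thrift Hive metastore; the metastore URI is a placeholder, the Enable Local Snapshot Table option has no documented upstream equivalent and is omitted, and the exact keys accepted by your deployment may differ.

```properties
# etc/catalog/delta.properties -- illustrative sketch only
connector.name=delta

# "Hive Metastore": the type of Hive metastore to use (thrift by default)
hive.metastore=thrift

# Placeholder Thrift metastore endpoint
hive.metastore.uri=thrift://metastore.example.com:9083
```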
Optional Connection Parameters
The following table lists the optional connection parameters:
Parameter | Description | Default Value | Data Type |
---|---|---|---|
Delta Parquet Dereference Pushdown Enabled | Enable pushing nested column dereferences into the table scan so that only the required fields of a struct-type column are read | true | BOOLEAN |
Delta Max Splits Batch Size | Delta: Max split batch size | 200 | INTEGER |
Delta Max Partitions Per Writer | Delta: Maximum number of partitions per writer | 100 | INTEGER |
Hive Insert Overwrite Immutable Partitions Enabled | When enabled, insert queries overwrite existing partitions when partitions are immutable. This configuration only takes effect when hive.immutable-partitions is set to true | false | BOOLEAN |
Hive Create Empty Bucket Files For Temporary Table | Create empty files when there is no data for temporary table buckets | false | BOOLEAN |
Hive Enable Parquet Batch Reader Verification | Enable verification of the optimized Parquet batch reader | false | BOOLEAN |
Hive Min Bucket Count To Not Ignore Table Bucketing | Ignore table bucketing when the table bucket count is less than the specified value; otherwise, bucketing is controlled by the hive.ignore-table-bucketing property | 0 | INTEGER |
Hive Partition Statistics Based Optimization Enabled | Enables partition statistics based optimization, including partition pruning and predicate stripping | false | BOOLEAN |
Hive Experimental Optimized Partition Update Serialization Enabled | Serialize PartitionUpdate objects using binary SMILE encoding and compress them with ZSTD compression | false | BOOLEAN |
Hive Materialized View Missing Partitions Threshold | Materialized views with more missing partitions than this threshold fall back to the base tables at read time | 100 | INTEGER |
Hive S3select Pushdown Max Connections | The maximum number of client connections allowed for S3 Select pushdown operations from worker nodes | 500 | INTEGER |
Hive Temporary Staging Directory Enabled | Use a temporary staging directory for write operations when possible | true | BOOLEAN |
Hive Temporary Staging Directory Path | Location of the temporary staging directory for write operations. Use the ${USER} placeholder to use a different location for each user. | /tmp/presto-${USER} | STRING |
Hive Temporary Table Storage Format | The default file format used when creating new tables. | ORC | STRING |
Hive Temporary Table Compression Codec | The compression codec to use when writing files for temporary tables | SNAPPY | STRING |
Hive Use Pagefile For Hive Unsupported Type | Automatically switch to PAGEFILE format for materialized exchange when encountering unsupported types | true | BOOLEAN |
Hive Parquet Pushdown Filter Enabled | Enable complex filter pushdown for Parquet | false | BOOLEAN |
Hive Range Filters On Subscripts Enabled | Enable pushdown of range filters on subscripts (a[2] = 5) into ORC column readers | false | BOOLEAN |
Hive Adaptive Filter Reordering Enabled | Enable adaptive filter reordering | true | BOOLEAN |
Hive Parquet Batch Read Optimization Enabled | Enable Parquet batch read optimization | false | BOOLEAN |
Hive Enable Parquet Dereference Pushdown | Enable dereference expression pushdown into the Parquet reader | false | BOOLEAN |
Hive Max Metadata Updater Threads | Maximum number of metadata updater threads | 100 | INTEGER |
Hive Partial Aggregation Pushdown Enabled | Enable partial aggregation pushdown | false | BOOLEAN |
Hive Manifest Verification Enabled | Enable verification of file names and sizes in manifest / partition parameters | false | BOOLEAN |
Hive Undo Metastore Operations Enabled | Enable undo metastore operations | true | BOOLEAN |
Hive Verbose Runtime Stats Enabled | Enable tracking all runtime stats. Note that this may affect query performance | false | BOOLEAN |
Hive Prefer Manifests To List Files | Prefer to fetch the list of file names and sizes from manifests rather than storage | false | BOOLEAN |
Hive Partition Lease Duration | Partition lease duration | 0.00s | DURATION |
Hive Size Based Split Weights Enabled | Enable estimating split weights based on size in bytes | true | BOOLEAN |
Hive Minimum Assigned Split Weight | Minimum weight that a split can be assigned when size based split weights are enabled | 0.05 | DOUBLE |
Hive Use Record Page Source For Custom Split | Use the record page source for custom splits. Used to query MOR (merge-on-read) tables in Hudi. | true | BOOLEAN |
Hive Split Loader Concurrency | Number of maximum concurrent threads per split source | 4 | INTEGER |
Hive Domain Compaction Threshold | Maximum ranges to allow in a tuple domain without compacting it | 100 | INTEGER |
Hive Max Concurrent File Renames | Maximum concurrent file renames | 20 | INTEGER |
Hive Max Concurrent Zero Row File Creations | Maximum number of zero row file creations | 20 | INTEGER |
Hive Recursive Directories | Enable reading data from subdirectories of table or partition locations. If disabled, subdirectories are ignored. | false | BOOLEAN |
Hive User Defined Type Encoding Enabled | Enable user defined type | false | BOOLEAN |
Hive Loose Memory Accounting Enabled | When enabled, relaxes memory accounting so that queries that previously honored memory thresholds can run even when they violate memory limits | false | BOOLEAN |
Hive Max Outstanding Splits Size | Maximum amount of memory allowed for split buffering for each table scan in a query, before the query is failed | 256MB | DATASIZE |
Hive Max Split Iterator Threads | Maximum number of iterator threads | 1000 | INTEGER |
Hive Allow Corrupt Writes For Testing | Allow Hive connector to write data even when data will likely be corrupt | false | BOOLEAN |
Hive Create Empty Bucket Files | Should empty files be created for buckets that have no data? | true | BOOLEAN |
Hive Max Partitions Per Writers | Maximum number of partitions per writer | 100 | INTEGER |
Hive Write Validation Threads | Number of threads used for verifying data after a write | 16 | INTEGER |
Hive Orc Tiny Stripe Threshold | ORC: Threshold below which an ORC stripe or file will be read in its entirety | 8MB | DATASIZE |
Hive Orc Lazy Read Small Ranges | ORC: Read small disk ranges lazily | true | BOOLEAN |
Hive Orc Bloom Filters Enabled | ORC: Enable bloom filters for predicate pushdown | false | BOOLEAN |
Hive Orc Default Bloom Filter Fpp | ORC Bloom filter false positive probability | 0.05 | DOUBLE |
Hive Orc Optimized Writer Enabled | Experimental: ORC: Enable optimized writer | true | BOOLEAN |
Hive Orc Writer Validation Percentage | Percentage of ORC files to validate after write by re-reading the whole file | 0.0 | DOUBLE |
Hive Orc Writer Validation Mode | Level of detail in ORC validation. Lower levels require more memory | BOTH | STRING |
Hive Rcfile Optimized Writer Enabled | Experimental: RCFile: Enable optimized writer | true | BOOLEAN |
Hive Assume Canonical Partition Keys | Assume canonical partition keys? | false | BOOLEAN |
Hive Parquet Fail On Corrupted Statistics | Fail when scanning Parquet files with corrupted statistics | true | BOOLEAN |
Hive Parquet Max Read Block Size | Parquet: Maximum size of a block to read | 16MB | DATASIZE |
Hive Optimize Mismatched Bucket Count | Enable optimization to avoid shuffle when bucket count is compatible but not the same | false | BOOLEAN |
Hive Zstd Jni Decompression Enabled | Use JNI based zstd decompression for reading ORC files | false | BOOLEAN |
Hive File Status Cache Size | Hive file status cache size | 0 | LONG |
Hive File Status Cache Expire Time | Hive file status cache expiry time | 0.00s | DURATION |
Hive Per Transaction Metastore Cache Maximum Size | Maximum number of metastore data objects in the Hive metastore cache per transaction | 1000 | INTEGER |
Hive Metastore Refresh Interval | Asynchronously refresh cached metastore data after access if it is older than this but is not yet expired, allowing subsequent accesses to see fresh data. | 0.00s | DURATION |
Hive Metastore Cache Maximum Size | Maximum number of metastore data objects in the Hive metastore cache | 10000 | INTEGER |
Hive Metastore Refresh Max Threads | Maximum threads used to refresh cached metastore data | 100 | INTEGER |
Hive Partition Versioning Enabled | Enable partition versioning | false | BOOLEAN |
Hive Metastore Impersonation Enabled | Should Presto user be impersonated when communicating with Hive Metastore | false | BOOLEAN |
Hive Partition Cache Validation Percentage | Percentage of partition cache validation | 0.0 | DOUBLE |
Hive Metastore Thrift Client Socks Proxy | Metastore thrift client socks proxy | null | STRING |
Hive Metastore Timeout | Timeout for Hive metastore requests | 10.00s | DURATION |
Hive Dfs Verify Checksum | Verify checksum for data consistency | true | BOOLEAN |
Hive Metastore Cache Ttl | Duration for which cached metastore data should be considered valid | 0.00s | DURATION |
Hive Metastore Recording Path | Metastore recording path | null | STRING |
Hive Replay Metastore Recording | Replay metastore recording | false | BOOLEAN |
Hive Metastore Recording Duration | Metastore recording duration | 0.00m | DURATION |
Hive Dfs Require Hadoop Native | Require the Hadoop native library | true | BOOLEAN |
Hive Metastore Cache Scope | Metastore cache scope | ALL | STRING |
Hive Metastore Authentication Type | Hive metastore authentication type. | NONE | STRING |
Hive Hdfs Authentication Type | HDFS authentication type. | NONE | STRING |
Hive Hdfs Impersonation Enabled | Should Presto user be impersonated when communicating with HDFS | false | BOOLEAN |
Hive Hdfs Wire Encryption Enabled | Should be turned on when HDFS wire encryption is enabled | false | BOOLEAN |
Hive Skip Target Cleanup On Rollback | Skip deletion of target directories when a metastore operation fails and the write mode is DIRECT_TO_TARGET_NEW_DIRECTORY | false | BOOLEAN |
Hive Bucket Execution | Enable bucket-aware execution: only use a single worker per bucket | true | BOOLEAN |
Hive Bucket Function Type For Exchange | Hash function type for exchange | HIVE_COMPATIBLE | STRING |
Hive Ignore Unreadable Partition | Ignore unreadable partitions and report as warnings instead of failing the query | false | BOOLEAN |
Hive Max Buckets For Grouped Execution | Maximum number of buckets to run with grouped execution | 1000000 | INTEGER |
Hive Sorted Write To Temp Path Enabled | Enable writing temp files to temp path when writing to bucketed sorted tables | false | BOOLEAN |
Hive Sorted Write Temp Path Subdirectory Count | Number of directories per partition for temp files generated by writing sorted table | 10 | INTEGER |
Hive Fs Cache Max Size | Hadoop FileSystem cache size | 1000 | INTEGER |
Hive Non Managed Table Writes Enabled | Enable writes to non-managed (external) tables | false | BOOLEAN |
Hive Non Managed Table Creates Enabled | Enable non-managed (external) table creates | true | BOOLEAN |
Hive Table Statistics Enabled | Enable use of table statistics | true | BOOLEAN |
Hive Partition Statistics Sample Size | Specifies the number of partitions to analyze when computing table statistics. | 100 | INTEGER |
Hive Ignore Corrupted Statistics | Ignore corrupted statistics rather than failing | false | BOOLEAN |
Hive Collect Column Statistics On Write | Enables automatic column level statistics collection on write | false | BOOLEAN |
Hive S3select Pushdown Enabled | Enable query pushdown to AWS S3 Select service | false | BOOLEAN |
Hive Max Initial Splits | Max initial splits | 200 | INTEGER |
Hive Max Initial Split Size | Max initial split size | null | DATASIZE |
Hive Writer Sort Buffer Size | Write sort buffer size | 64MB | DATASIZE |
Hive Node Selection Strategy | Node affinity selection strategy | NO_PREFERENCE | STRING |
Hive Max Split Size | Max split size | 64MB | DATASIZE |
Hive Max Partitions Per Scan | Maximum allowed partitions for a single table scan | 100000 | INTEGER |
Hive Max Outstanding Splits | Target number of buffered splits for each table scan in a query, before the scheduler tries to pause itself | 1000 | INTEGER |
Hive Metastore Partition Batch Size Min | Hive metastore: minimum batch size for partitions | 10 | INTEGER |
Hive Metastore Partition Batch Size Max | Hive metastore: maximum batch size for partitions | 100 | INTEGER |
Hive Config Resources | An optional comma-separated list of HDFS configuration files | [] | FILEPATH |
Hive Dfs Ipc Ping Interval | The client sends a ping when this interval passes without receiving bytes | 10.00s | DURATION |
Hive Dfs Timeout | DFS timeout | 60.00s | DURATION |
Hive Dfs Connect Timeout | DFS connection timeout | 500.00ms | DURATION |
Hive Dfs Connect Max Retries | DFS - max retries in case of connection issue | 5 | INTEGER |
Hive Storage Format | The default file format used when creating new tables. | ORC | STRING |
Hive Compression Codec | The compression codec to use when writing files | GZIP | STRING |
Hive Orc Compression Codec | The preferred compression codec to use when writing ORC and DWRF files | GZIP | STRING |
Hive Respect Table Format | Should new partitions be written using the existing table format or the default PrestoDB format? | true | BOOLEAN |
Hive Immutable Partitions | Can new data be inserted into existing partitions? | false | BOOLEAN |
Hive Max Open Sort Files | Maximum number of writer temporary files to read in one pass | 50 | INTEGER |
Hive Dfs Domain Socket Path | This is a path in the filesystem that allows the client and the DataNodes to communicate. | null | STRING |
Hive S3 File System Type | S3 file system type | PRESTO | STRING |
Hive Gcs Json Key File Path | JSON key file used to access Google Cloud Storage | null | FILEPATH |
Hive Gcs Use Access Token | Use client-provided OAuth token to access Google Cloud Storage | false | BOOLEAN |
Hive Orc Use Column Names | Access ORC columns using names from the file | false | BOOLEAN |
Hive Orc Max Merge Distance | ORC: Maximum size of gap between two reads to merge into a single read | 1MB | DATASIZE |
Hive Orc Max Buffer Size | ORC: Maximum size of a single read | 8MB | DATASIZE |
Hive Orc Stream Buffer Size | ORC: Size of buffer for streaming reads | 8MB | DATASIZE |
Hive Orc Max Read Block Size | ORC: Soft max size of Presto blocks produced by ORC reader | 16MB | DATASIZE |
Hive Rcfile Writer Validate | Validate RCFile after write by re-reading the whole file | false | BOOLEAN |
Hive Text Max Line Length | Maximum line length for text files | 100MB | DATASIZE |
Hive Parquet Use Column Names | Access Parquet columns using names from the file | false | BOOLEAN |
Hive File Status Cache Tables | The tables that have the file status cache enabled. Setting to '*' includes all tables | | STRING |
Hive Skip Deletion For Alter | Skip deletion of old partition data when a partition is deleted and then inserted in the same transaction | false | BOOLEAN |
Hive Sorted Writing | Enable writing to bucketed sorted tables | true | BOOLEAN |
Hive Ignore Table Bucketing | Ignore table bucketing to enable reading from unbucketed partitions | false | BOOLEAN |
Hive Temporary Table Schema | Schema where to create temporary tables | default | STRING |
Hive Pushdown Filter Enabled | Experimental: enable complex filter pushdown | false | BOOLEAN |
Hive Pagefile Writer Stripe Max Size | PAGEFILE: Max stripe size | 24MB | DATASIZE |
Hive File Renaming Enabled | Enable file renaming | false | BOOLEAN |
Hive Partial Aggregation Pushdown For Variable Length Datatypes Enabled | Enable partial aggregation pushdown for variable-length datatypes | false | BOOLEAN |
Hive Time Zone | Sets the default time zone | null | STRING |
Hive Orc Writer Stripe Min Size | ORC: Min stripe size | 32MB | DATASIZE |
Hive Orc Writer Stripe Max Size | ORC: Max stripe size | 64MB | DATASIZE |
Hive Orc Writer Stripe Max Rows | ORC: Max stripe row count | 10000000 | INTEGER |
Hive Orc Writer Row Group Max Rows | ORC: Max rows in a row group | 10000 | INTEGER |
Hive Orc Writer Dictionary Max Memory | ORC: Max dictionary memory | 16MB | DATASIZE |
Hive Orc Writer String Statistics Limit | ORC: Maximum size of string statistics; drop if exceeding | 64B | DATASIZE |
Hive Orc Writer Stream Layout Type | ORC: Stream layout type | BY_COLUMN_SIZE | STRING |
Hive Orc Writer Dwrf Stripe Cache Mode | Describes content of the DWRF stripe metadata cache. | INDEX_AND_FOOTER | STRING |
Hive Orc Writer Max Compression Buffer Size | ORC: Max compression buffer size | 256kB | DATASIZE |
Hive Orc Writer Dwrf Stripe Cache Enabled | DWRF stripe cache enabled? | false | BOOLEAN |
Hive Orc Writer Dwrf Stripe Cache Max Size | DWRF stripe cache max size | 8MB | DATASIZE |
Hive Parquet Optimized Writer Enabled | Parquet: Optimized writer enabled? | false | BOOLEAN |
Hive Parquet Writer Block Size | Parquet: Writer block size | 134217728B | DATASIZE |
Hive Parquet Writer Page Size | Parquet: Writer page size | 1048576B | DATASIZE |
Hive Security | The type of access control to use | legacy | STRING |
Generic Cache Enabled | Enable caching while querying | true | BOOLEAN |
Transparent Cache Enabled | Enable transparent caching while querying | true | BOOLEAN |
Generic Cache Table Ttl | TTL for cache table expiry in minutes | 1440 | INTEGER |
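Most of the optional parameters above map onto hive.* catalog properties of the underlying Presto Hive/Delta connector. The sketch below shows a few of them in the same illustrative delta catalog properties file; the key names are assumptions based on the upstream Presto Hive connector, so verify them against the documentation linked in the note above before use.

```properties
# Illustrative optional tuning keys for the delta catalog. Key names are assumed
# from the upstream Presto Hive connector; values shown are the defaults listed above.

# Hive Metastore Timeout
hive.metastore-timeout=10s
# Hive Max Partitions Per Writers
hive.max-partitions-per-writers=100
# Hive Compression Codec
hive.compression-codec=GZIP
# Hive Parquet Use Column Names
hive.parquet.use-column-names=false
# Hive Recursive Directories
hive.recursive-directories=false
```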