Drill Table Function

Explains Apache Drill’s table functions, also known as table-valued functions (TVFs).

Apache Drill supports table functions, also known as table-valued functions (TVFs). A table function returns a table and can be used anywhere a table can be used. Table functions behave similarly to views, but they can take parameters.

Syntax

To invoke a table function, you must wrap it with the special TABLE() construct. This wrapper signals that the function returns a table rather than a scalar value:
table(table_function_name())

This construct is case-insensitive.

Drill table functions

Currently, Drill supports two types of table functions.
  • Schema table function
  • Format plugin table function

These functions are called using the target table name, but they define different parameters.

The function arguments are passed using the parameter names and the => syntax:
table_name(parameter_name => some_value)
CAUTION
Function parameter names are case-sensitive.
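
As a minimal sketch (the file path here is hypothetical; the type and extractHeader parameters belong to the format plugin table function described below):

SELECT *
FROM TABLE(dfs.tmp.`example.csv`(type => 'text', extractHeader => true))

Passing extractheader or EXTRACTHEADER instead would not match, because parameter names are case-sensitive.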

Schema table function

A schema table function allows you to define a schema for an individual query. You can specify a path to the schema or provide the schema directly in the query. This is useful when the table’s schema, which normally resides in the table’s root folder, is located elsewhere, or when you want to reference an alternate schema or define the schema inline.

The schema table function works with tables from any data storage, but schema provisioning is supported only for text files.

Starting with DEP version 10.0.0, schema provisioning is also supported for JSON MapR-DB tables. However, the schema is used only as a hint to resolve ambiguous column types, such as columns that contain only null values. Schema provisioning for JSON MapR-DB tables is not a replacement for the CAST function. If a column is stored as STRING and you want to read it as INT, use the CAST function: Data Type Conversion - Apache Drill.
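
For example, assuming a hypothetical table with an `age` column stored as STRING:

SELECT CAST(t.`age` AS INT) AS age
FROM dfs.`/tables/users` t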

The schema table function has one parameter: schema.

For more details about the schema provisioning feature and the schema table function syntax, see Schema provisioning using a table function.

Parameters
  • schema - accepts a string value of two kinds: path and inline.
The path value is used to specify a path to the table schema:
table_name(schema => 'path=`/path/to/the/schema`')
For example:
SELECT *
FROM TABLE(dfs.tmp.`myTable.csv`(schema => 'path=`/tmp/schemas/myTableSchema`'))
The inline value is used to provide a schema directly in the query:
table_name(schema => 'inline=(column_name data_type [nullability] [format] [default] [properties {`prop` = `val`, ...}])')
For example:
SELECT *
FROM TABLE(dfs.tmp.`text_table`(
  schema => 'inline=(col1 date properties {`drill.format` = `yyyy-MM-dd`})
  properties {`drill.strict` = `false`}'))

Format plugin table function

The format plugin table function allows you to override the format options from the file system storage plugin or replace the table’s detected format with your own.

The format specified here takes precedence over the format automatically detected by Drill. If the defined format matches the format detected by Drill, the storage format options from the storage plugin are used as defaults and overridden by any parameters specified in the table function.

Parameters
  • type - required. The file system table format. Valid values are the names of Drill format plugins; they are case-sensitive and include: delta, spss, logregex, syslog, parquet, sequencefile, pcap, xml, iceberg, json, ltsv, text, maprdb, avro, image, sas, pcapng, streams, httpd, excel, pdf, shp, hdf5, msaccess.

  • dynamic parameters - optional. Parameters that correspond to the Drill format plugin configuration variables, excluding extensions (which do not apply in this context).
  • schema - optional. The value corresponds to the value for the same option in the schema table function.
Examples
Set the extractHeader option for a CSV file:
SELECT *
FROM TABLE(dfs.tmp.`users.csv`(type => 'text', extractHeader => true))
Set fieldDelimiter, extractHeader, and schema for a CSV file:
SELECT *
FROM TABLE(dfs.tmp.`cars.csvh-test`(
  type => 'text', 
  fieldDelimiter => ',', 
  extractHeader => true, 
  schema => 'inline=(col1 date)'))
Use allTextMode for a JSON file:
SELECT *
FROM TABLE(dfs.tmp.`students`(type => 'json', allTextMode => true))
Dynamic parameters
This is a list of parameters for common formats.
CAUTION
The default value is used if the parameter is set neither in the storage plugin configuration nor in the format plugin table function.
Table 1. text
  • lineDelimiter - default: \n. A string used as a line delimiter in the text file.
  • fieldDelimiter - default: ,. A character used as a field delimiter in the text file.
  • quote - default: ". Defines the character used to enclose text fields that may contain delimiters, line breaks, or other special characters. For example, when quote is ", a field value such as "New York, NY" is treated as a single column value even though it contains a comma.
  • escape - default: ". Specifies the character used to escape a quotation mark inside a quoted field. For example, with escape set to ", the value "He said ""Hello""" is parsed as He said "Hello".
  • comment - default: #. Identifies the character that marks the beginning of a comment line. Any line starting with this character is ignored during parsing.
  • skipFirstLine - default: false. A Boolean value that indicates whether the first line in the text file should be omitted.
  • extractHeader - default: false. A Boolean value that indicates whether the first valid record parsed from the text file should be treated as the row containing the column names.
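
For example, several of the text options above can be combined in one call; the file path is hypothetical:

SELECT *
FROM TABLE(dfs.tmp.`data.psv`(
  type => 'text',
  fieldDelimiter => '|',
  extractHeader => true,
  comment => '#'))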

Table 2. json
  • allTextMode - default: the value of store.json.all_text_mode (false). A Boolean option that forces all fields in JSON files to be read as VARCHAR. This helps prevent schema change errors when field types vary between records.
  • readNumbersAsDouble - default: the value of store.json.read_numbers_as_double (false). A Boolean value. If true, all numeric values are read as DOUBLE, regardless of whether they include a decimal point.
  • skipMalformedJSONRecords - default: the value of store.json.reader.skip_invalid_records (false). A Boolean value that enables the JSON reader to skip malformed or invalid records in JSON files. When set to true, Drill ignores these records and continues processing the remaining data, rather than terminating the query.
  • escapeAnyChar - default: the value of store.json.reader.allow_escape_any_char (false). A Boolean value that enables the JSON record reader to escape any character when processing JSON data.
  • nanInf - default: the value of store.json.reader.allow_nan_inf (true). A Boolean value that enables the JSON record reader to interpret NaN and Infinity tokens in JSON data as numeric values.
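
For example, several of the json options above can be set together; the file path is hypothetical:

SELECT *
FROM TABLE(dfs.tmp.`events.json`(
  type => 'json',
  readNumbersAsDouble => true,
  skipMalformedJSONRecords => true))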

Table 3. maprdb
  • allTextMode - default: false. A Boolean option that forces all fields in the MapR-DB table to be read as VARCHAR. This helps prevent schema change errors when field types vary between records.
  • enablePushdown - default: true. Enables query pushdown to the MapR-DB engine. When set to true, Drill delegates supported filters and projections directly to MapR-DB for improved performance.
  • ignoreSchemaChange - default: false. Allows queries to continue even when schema changes are detected in MapR-DB tables. When enabled, Drill ignores schema inconsistencies instead of failing the query.
  • readAllNumbersAsDouble - default: false. A Boolean value. If set to true, all numeric values are read as DOUBLE, regardless of whether they include a decimal point.
  • disableCountOptimization - default: false. Disables the optimization that uses table metadata to compute COUNT(*) queries more efficiently. When set to true, Drill performs a full scan instead of using the optimization.
  • readTimestampWithZoneOffset - default: false. When set to true, Drill converts timestamp values from UTC to the local time zone.
  • nonExistentFieldSupport - default: true. Enables support for queries that reference fields not present in all documents. When set to true, Drill treats missing fields in MapR-DB records as NULL.
  • index - default: none. A string value that specifies a hint for the secondary index table to use when executing queries on MapR-DB. If not set, Drill automatically evaluates available indexes, computes their costs, and selects the most efficient one.
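
For example, the index hint is passed as a string; the table and index names here are hypothetical:

SELECT *
FROM TABLE(dfs.`/tables/users`(
  type => 'maprdb',
  index => 'users_name_idx'))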