Using Casts in Secondary Indexes
Defining an index that specifies index keys with CAST functions provides fast access for queries that contain CAST functions. The index converts the indexed field to the type specified by the CAST function and stores the result.
age
field to an INT type and the height
field to a FLOAT
type.
maprcli table index add -path /castTable -index castIdx \
-indexedfields '$CAST(age@INT)','$CAST(height@FLOAT)'
When issuing Drill queries through Business Intelligence (BI) tools, you can include CAST functions in your queries to create Drill views. Including CAST functions provides the metadata needed to optimally process the queries. For more information about using the CAST function with Drill, see Data Type Conversion.
Casting from NULL in Drill
{ "name": null }
Null can also
represent the absence of an actual value, for
example:{ "_id":1,"name":"Annie" } { "_id":2 } (name does not exist)
When you cast on columns with missing values, Drill does not return null for the missing values. Drill only returns null in cases where an actual value of null exists.
a1 | b1 |
1 | 'abc' |
2 | null |
3 | 'null' |
SELECT a1, b1(cast b1 as varchar(20))) from t1;
a1 | b1 |
1 | abc |
2 | |
3 | null |
Guidelines for Using Casts in Indexes
The following rules apply to CAST functions used in secondary indexes:
- You can include the CAST function only on indexed fields.
- You do not have to cast between comparable data types.
- Indexes support casting to the following data types:
- Boolean
- String
- Int
- Long
- Float
- Double
- Date
- Time
- Timestamp
NOTEHPE Ezmeral Data Fabric Database does not support casting from any data type tobyte
,short
,decimal
,binary
, orinterval
.NOTEQueries that use the CAST function on fields withtimestamp
andbinary
data types are not supported. - When casting to a
string
type, you can optionally specify a length. If you do not specify a length, it defaults to the maximum length of 255. - When casting a
float
ordouble
type to astring
type, you cannot control the precision of the digits in the resulting string value.Float
anddouble
are approximate representations of decimal values. - When casting from a
binary
type, HPE Ezmeral Data Fabric Database assumes that the binary value is a UTF-8 formatted string representation of the resulting data type. - If HPE Ezmeral Data Fabric Database cannot cast a value, HPE Ezmeral Data Fabric Database indexes the row with an encoding error that specifies a CAST issue.
- You cannot cast all data types to the supported data types. See the Casting Matrix below for supported and unsupported combinations.
Casting Matrix
int | long | float | double | string | boolean | date | time | timestamp | |
byte | Y | Y | Y | Y | Y | Y | N | N | N |
short | Y | Y | Y | Y | Y | Y | N | N | N |
int | - | Y | Y | Y | Y | Y | Y1 | Y2 | Y |
long | Y | - | Y | Y | Y | Y | Y | Y | Y |
float | Y | Y | - | Y | Y | Y | N | N | N |
double | Y | Y | Y | - | Y | Y | N | N | N |
string | Y | Y | Y | Y | - | Y3 | Y | Y | Y |
boolean | Y | Y | N | N | Y | - | N | N | N |
date | N | N | N | N | Y | N | - | Y | Y |
time | N | N | N | N | Y | N | N | - | N |
timestamp | N | N | N | N | Y4 | N | Y | Y | - |
binary | Y | Y | Y | Y | Y | N | N | N | N |
array | N | N | N | N | N | N | N | N | N |
nested document | N | N | N | N | N | N | N | N | N |
1 When casting int/long to a date type, the date value is constructed based on the int/long value being the number of milliseconds since epoch.
2 When casting int/long to a time type, the time value is constructed based on the int/long value being the time of day in milliseconds.
3
HPE Ezmeral Data Fabric Database casts the strings true
,
yes
, on
, y
, t
, and
1
to boolean true
. HPE Ezmeral Data Fabric Database casts the strings false
, no
, off
,
n
, f
, and 0
to boolean
false
.
4 The string represents the time in UTC timezone.
Example Using Cast Function in an Index
This example shows you how to create an index with the CAST function.
lineitem
and casts the
L_LINENUMBER
and L_ORDERKEY
fields to the
int
data
type:SELECT L_LINESTATUS, L_QUANTITY FROM lineitem WHERE CAST(L_LINENUMBER as int) = 1 AND CAST(L_ORDERKEY as int) = 550;
L_LINENUMBER
and
L_ORDERKEY
fields and indicate the use of the CAST function and data type
for each field, as
follows:maprcli table index add -path /drill/testdata/qa/sf1/maprdb/json/lineitem -index l_cast_comp_1 \
-indexedfields '$CAST(L_LINENUMBER@INT)','$CAST(L_ORDERKEY@INT)' \
-includedfields L_LINESTATUS,L_QUANTITY
L_LINENUMBER
and
L_ORDERKEY
fields as the int
data type. HPE Ezmeral Data Fabric Database can use the index for any subsequent queries that use the
CAST function to retrieve these fields as the int
type, instead of
accessing data in the primary table and converting the values to
int
.L_LINENUMBER
and L_ORDERKEY
fields without the CAST function, the query used in this
example would not benefit from the index.