Data Types and Secondary Index Fields
Secondary indexes support a specific set of data types. This section describes how indexed and included fields in secondary indexes behave for various categories of data types.
Data Types of Indexed Fields
Prior to data-fabric 6.1, the indexed fields in a secondary index had to contain scalar data. For each scalar data value, HPE Ezmeral Data Fabric Database stored a row in the index. See the table in the Scalar Data section of JSON Document Data Types for a list of scalar types.
Beginning with data-fabric 6.1, indexed fields can also be nested documents or arrays, but not array elements. As with scalar data values, HPE Ezmeral Data Fabric Database stores a row in the index for each nested document and array. The index improves equality filters on the entire nested document or array.
data-fabric 6.1 also supports using container field paths as indexed fields.
The following table summarizes what HPE Ezmeral Data Fabric Database supports, depending on the characteristics of the indexed field:
Characteristics of Indexed Field | Pre-6.1 Behavior | 6.1 Behavior |
---|---|---|
Field contains scalar data | Supported | Supported |
Field contains nested document data | Not supported | Supported |
Field contains array data | Not supported | Supported |
Field path is a nested document subfield | Supported only if the subfield contains scalar data | Supported for any data type |
Field is an individual array element | Not supported | Not supported |
Field uses a container field path | Not applicable | Supported |
To understand what HPE Ezmeral Data Fabric Database stores for an indexed field defined on different data types, consider an example in which you have the following documents:
{"_id":"0", "field":0} {"_id":"1", "field":[0,1,2]} {"_id":"2", "field":{"subField":1}} {"_id":"3", "field":{"subField":[1,2,3]}} {"_id":"4", "field":[{"subField":1},{"subField":2}]} {"_id":"5", "field":[{"subField":[1,2,3]},{"subField":[4, 5]}]}
The
following table shows what an index defined on field
stores and an OJAI
query condition that matches the value stored in the index:
Document ID | Value Stored in Index Defined on
field |
Matching OJAI Query Condition |
---|---|---|
0 |
|
{"$lt":{"field":1}} |
1 |
[0,1,2] |
{"$eq":{"field":[0,1,2]}} |
2 |
{"subField":1} |
{"$eq":{"field":{"subField":1}}} |
3 |
{"subField":[1,2,3]} |
{"$eq":{"field":{"subField":[1,2,3]}}} |
4 |
[{"subField":1},{"subField":2}] |
{"$eq":{"field":[{"subField":1},{"subField":2}]}} |
5 |
[{"subField":[1,2,3]},{"subField":[4, 5]}] |
{"$eq":{"field":[{"subField":[1,2,3]},{"subField":[4,5]}]}} |
The following table shows what an index defined on
field.subField
stores and an OJAI query condition that matches the
value stored in the index:
Document ID | Value Stored in Index Defined on
field.subField |
Matching OJAI Query Condition |
---|---|---|
0 | Missing 1 | N/A |
1 | Missing 1 | N/A |
2 |
|
{"$lt":{"field.subField":5}} |
3 |
|
{"$eq":{"field.subField":[1,2,3]}} |
4 | Missing 2 | N/A |
5 | Missing 2 | N/A |
1 The index entry for documents 0 and 1 are missing because
field
is not a nested document in these documents.
2 The index entries for documents 3 and 4 are missing because
field
is an array in those documents.
These indexes enable HPE Ezmeral Data Fabric Database to quickly look up values stored in the index. As shown in the table, these values can be scalars, arrays, or nested documents. In the case of the latter two types, HPE Ezmeral Data Fabric Database can only use the index for equality conditions.
Data Types of Included Fields
There are no type restrictions on the included fields in an index.
Using Container Field Paths as Indexed Fields
When an indexed field is not a container field path, the index contains one row per field value.
For example, suppose you have the same set of documents shown earlier:
{"_id":"0", "field":0} {"_id":"1", "field":[0,1,2]} {"_id":"2", "field":{"subField":1}} {"_id":"3", "field":{"subField":[1,2,3]}} {"_id":"4", "field":[{"subField":1},{"subField":2}]} {"_id":"5", "field":[{"subField":[1,2,3]},{"subField":[4, 5]}]}
field[]
field[].subField
field.subField[]
field[].subField[]
Each entry in the table represents a row in the index.
Document ID | Indexed Field Path | |||
---|---|---|---|---|
field[] |
field[].subField |
field.subField[] |
field[].subField[] |
|
0 |
|
Missing 1 | Missing 1 | Missing 1 |
1 |
|
Missing 1 | Missing 1 | Missing 1 |
|
||||
|
||||
2 |
{"subField":1} |
|
|
|
3 |
{"subField":[1,2,3]} |
|
|
|
|
|
|||
|
|
|||
4 |
{"subField":1} |
|
Missing 2 |
|
{"subField":2} |
|
|
||
5 |
{"subField":[1,2,3]} |
|
Missing 2 |
|
|
||||
{"subField":[4, 5]} |
|
|
||
|
||||
|
1 The index entries for documents 0 and 1 are missing in all indexes except
the index on field[]
because field
is not a nested
document.
2 The index entries for documents 3 and 4 are missing in the index on
field.subField[]
because field
is an array in those
documents.
To use these indexes, your query condition must use container field paths that correspond to the indexed fields. The following are sample OJAI query conditions that you might use with each index:
Indexed Field Path | Sample OJAI Query Condition | Matching Document(s) |
---|---|---|
field[] |
{"$eq":{"field[]":0} |
0, 1 |
{"$eq":{"field[]":{"subField":[1,2,3]}}} |
3, 5 | |
field[].subField |
{"$eq":{"field[].subField":1}} |
2, 4 |
field.subField[] |
{"$gt":{"field.subField[]":2}} |
3 |
field[].subField[] |
{"$eq":{"field[].subField[]":2}} |
3, 4, 5 |
See OJAI Query Conditions Using Container Field Paths for further details about how these types of conditions behave.
As shown in these examples, defining an index on a container field path is different from
defining an index on an entire array field. For example, an index on
field[]
can filter on individual array elements, whereas the index on
field
can filter only the entire value. Similarly, defining an index on
field[].subField[]
provides the most generality. It allows you to
filter on any elements in subField
, regardless of the data types in
both
field
and subField
. However, you also incur the overhead
of storing more data in your index and the performance impact of updating the
index.
With a container field path, you may need to add included fields in your index to make the index covering. See Covering Indexes and Container Field Paths for details.
There are also limitations in the composite indexes you can define. See Composite Indexes and Container Field Paths for details.
Comparisons and Sorts on Indexed Fields
Comparisons and sorts across data types differ depending on whether the types are comparable or noncomparable. This is not specific to secondary indexes. However, it impacts comparisons when using secondary indexes and the order HPE Ezmeral Data Fabric Database stores data in an index. See Using Comparable JSON Document Data Types in Comparisons and Sorts and Using Non-comparable JSON Document Data Types in Comparisons and Sorts to learn which types fall into each category and to understand their behavior.