table index add
This topic describes how to add secondary indexes on HPE Ezmeral Data Fabric Database JSON tables.
Permissions Required
To run this command, your user ID must have the following permissions:
readAce
on the volumelookupdir
on directories in the table pathindexperm
permission on the table
If you created the table in version 6.0 or later, you automatically have
indexperm
permission. For tables created before 6.0, even if you are the
owner of the table, you must explicitly add indexperm
permission.
Syntax
- CLI
-
maprcli table index add -path <path> -index <index name> -indexedfields < indexed field names > [ -includedfields < included field names > ] [ -hashed [ enable hashed index: true | false> ] [ -numhashpartitions < number of hash index partitions when hashed index is enabled > ]
- REST
-
curl -X POST \ 'http[s]://<host>:<port>/rest/table/index/add?path=<path>&index=<index name>&indexedfields=<indexed field names>&<parameters>' \ -u <username>:<password>
Parameters
Parameter | Description |
---|---|
path |
(Required) Path to where the parent JSON table resides. |
index |
(Required) Name of the index. |
indexedfields | (Required) Names of the indexed fields. This is a comma separated list of the
fields from the JSON table that are indexed and used for ordering the index. The
sort ordering of each field can be specified separately. The syntax is as
follows:
IMPORTANT Do not place a space between the commas and the field
names.A sort_order of A
sort _order of The following example
specifies two indexed fields. If
an indexed field contains a colon (:) in the name, you need to escape the last colon
in the name. In the example below, the indexed field names are the following:fieldName1 has an ascending sort,
while fieldName2 is
descending.
The following shows how to escape the last colon in the third indexed field.
NOTE The
CAST function can be applied on indexed fields. You must enclose each CAST
function call in single quotes. See the next section for details. |
includedfields | (Optional) Names of the included fields. This is a comma separated list of the
fields from the JSON table that are part of the index, but not used for ordering.
The syntax is as
follows:
IMPORTANT Do not place a space between the commas and the included field
names. |
hashed | (Optional) True | False. Default: false |
numhashpartitions | (Optional) Number of hash index partitions when the hashed index option is enabled. This parameter determines the number of logical partitions HPE Ezmeral Data Fabric Database distributes keys across. Incoming keys are hashed to 2 byte partition IDs. Default: 10 |
Applying CAST on Indexed Fields
Indexes can be defined with the CAST function applied to an indexed field.
The following statement queries a table named lineitem
and casts the
L_LINENUMBER
and L_ORDERKEY
fields to the
int
data type.
SELECT L_LINESTATUS, L_QUANTITY FROM lineitem WHERE CAST(L_LINENUMBER as int) = 1 AND CAST(L_ORDERKEY as int) = 550;
To optimize the previous statement, you can create an index on the L_LINENUMBER
and L_ORDERKEY
fields, and use the CAST function to map each field to a
specific data type, as shown below:
maprcli table index add \
-path /drill/testdata/qa/sf1/maprdb/json/lineitem \
-index l_cast_comp_1 -indexedfields '$CAST(L_LINENUMBER@INT)','$CAST(L_ORDERKEY@INT)' \
-includedfields L_LINESTATUS,L_QUANTITY
The index stores the values of the L_LINENUMBER
and L_ORDERKEY
fields as the int
data type. HPE Ezmeral Data Fabric Database can use the index for any subsequent
queries that cast these fields to int
instead of accessing data in the
primary table and converting the values to int
.
See Using Casts in Secondary Indexes for more information.
Restrictions
The following restrictions apply to creating indexes.
- Name Restrictions
- You cannot use the following characters in the index name and in the indexed fields:
To use the following characters in the index name and in the indexed fields, enclose them either in single or double quotes:< > ? % \
For example:; | ( ) /
maprcli table index add -path /volume1/MYTABLE -index "MYTABLE1_ANALYSIS_1設備^=#;{}&()/" \ -indexedfields "日時_timestamp":desc,"設備タイプ","LOTNo" -includedfields \ "データタイプ","達成度^=#;{}&()/" (or) maprcli table index add -path /volume1/MYTABLE -index 'MYTABLE1_ANALYSIS_1設備^=#;{}&()/' \ -indexedfields "日時_timestamp":desc,"設備タイプ","LOTNo" -includedfields \ 'データタイプ','達成度^=#;{}&()/'
To use either the ' or the " character in the index name and in the indexed fields, enclose:For example:- the ' character within double quotes (")
- the " character within single quote (')
maprcli table index add -path /volume1/MYTABLE -index "'MYTABLE1_ANALYSIS_1設備^=#;{}&()/" \ -indexedfields "日時_timestamp":desc,"設備タ'イプ","LOTNo" -includedfields \ "データタイプ'","達成度^=#;{}&()/" (or) maprcli table index add -path /volume1/MYTABLE -index '"MYTABLE1_ANALYSIS_1設備^=#;{}&()/' \ -indexedfields "'日時_timestamp":desc,"設備タイプ","LOTNo" -includedfields \ 'データタイプ"',"達成度^=#;{}&()/"
- Type Restrictions
-
If a composite index includes the same subfield in multiple indexed fields, the implied types of the subfields must be consistent.
For example, you cannot create an index with the following indexed fields:
a.b[].c, a.b.d
Although subfield
See Composite Indexes and Container Field Paths for more details.b
appears in both indexed fields, in the first, it is an array and in the second, it is a nested document.
- Size Restrictions
-
The maximum size of all indexed fields in an index is 32 KB.
If the collective size exceeds 32 KB, then an insert of the corresponding document results in an encoding error (INDEX_ROW_KEY_ENCODER_ERROR_ENCODING_IS_TOO_LONG
).The maximum number of indexes that you can create on a JSON table is 32.
- Field Definition Restrictions
-
You cannot specify individual array elements as indexed fields.
-
You cannot specify a table's
_id
field as an indexed field. -
If a field contains an array of nested documents and you want to index on subfields in the nested documents, then you must define the indexed field using a container field path.
You can include a specific field only once as either an indexed or included field, with the following two exceptions:
The indexed field is a container field path:
maprcli table index add -path /people \ -index phoneNumberIdx \ -indexedfields Phones[].Number \ -includedfields Phones[].Number
-
The field specifies a cast to another type.
You can create an index in which the
score
field is an indexed field cast as adouble
type, andscore
is also an included field. The included field retains the original data type of thescore
field:maprcli table index add -path /castTable \ -index castIdx1 \ -indexedfields '$CAST(score@DOUBLE)' \ -includedFields score
You can create an index in which the
score
field is an indexed field, cast as adouble
type, and thescore
field is also another indexed field, cast as along
type:maprcli table index add -path /castTable \ -index castIdx2 \ -indexedfields '$CAST(score@DOUBLE)','$CAST(score@LONG)'
-
You cannot use casts with included fields.
You cannot specify a field as either an indexed or included field if the field is also specified as a column family JSON path name.
For example, suppose you have the following JSON table:{ "_id" : "ID", "a" : { "b" : { "c" : "value", "d" : "value" }, "e" : "value" } }
If you create a column family at fieldc
in the JSON patha.b.c
, you cannot define fielda.b.c
as either an indexed or included field. You can define the fieldsa
,a.b
, anda.b.d
as either indexed or included fields.You cannot specify an included field in which the data in the field spans more than one column family.
In the following example, the included fieldsl1.sl2
spans column families,cf2
andcf3
:maprcli table cf list -path /cftab compressionperm readperm traverseperm jsonfamilypath writeperm minversions maxversions compression ttl inmemory cfname memoryperm u:root u:root u:root u:root 0 1 lz4 2147483647 false default u:root u:root u:root u:root sl1 u:root 0 1 lz4 2147483647 false cf1 u:root u:root u:root u:root sl1.sl2.sl3 u:root 0 1 lz4 2147483647 false cf2 u:root u:root u:root u:root sl1.sl2.sl3.sl4 u:root 0 1 lz4 2147483647 false cf3 u:root maprcli table index add -path /cftab -index i1 -indexedfields sl1.sl2.sl3.sl4.l4a, sl1.l1a -includedfields sl1.sl2,sl1.sl2.sl3.sl4.sl5.l5b -json { "timestamp":1507419777919, "timeofday":"2017-10-07 04:42:57.919 GMT-0700 PM", "status":"ERROR", "errors":[ { "id":22, "desc":"Data for included field sl1.sl2 may not span more than one column family." } ] }
You cannot specify a composite index with more than one container field path as your indexed fields, unless the prefixes of the container field paths are the same.
See Composite Indexes and Container Field Paths for more details.You cannot specify a composite index with an indexed field that is a subfield of another indexed field.
For example, you cannot create an index with the following indexed fields:a, a.b
The indexed field
a.b
is a subfield of the indexed fielda
.
- Option Restrictions
-
As indexes are automatically split, you cannot disable splits when you create your index.
- Index Use Restrictions
-
Indexes do not optimize non-existence filter conditions.
Example
- CLI
-
maprcli table index add -path /demo/business -index newIndex -indexedfields fieldName
- REST
-
curl -k -X POST \ 'https://r1n1.sj.us:8443/rest/table/index/add?path=%2Fdemo%2Fbusiness&index=newIndex&indexedfields=fieldName' \ -u mapr:mapr