Optimizing Queries with Indexes
HPE Ezmeral Data Fabric Database provides a highly scalable key-value database platform on which you can run SQL queries using Drill. As a part of the 6.0 release, HPE Ezmeral Data Fabric Database natively supports indexes on secondary fields in JSON tables.
_id
field (unique key field). By default, HPE Ezmeral Data Fabric Database sorts the JSON table
by the _id
field. All other fields in the JSON table are secondary fields.
You can create indexes on the secondary fields in a JSON table to eliminate full tables scans
and significanlty improve query performance. See HPE Ezmeral Data Fabric Database as a Document
Database and Secondary Index
Concepts for more information. Benefits of Indexes
Well-designed indexes can optimize access to data stored in HPE Ezmeral Data Fabric Database JSON tables and improve performance for high read operations, fast integrated analytics, and complex operational analytics. See Secondary Indexes for more information about the benefits of indexes.
Types of Queries that Benefit from Indexes
Query Type | Description |
Equality | Equality queries contain equality conditions, such as a=1 and can also include IN. See Equality Queries. |
Range | Range queries contain range conditions, such as <=, >=, and the
LIKE pattern matching condition. See Range
Queries. NOTE The LIKE operator only works on fields that have varchar
data types. To use the LIKE operator in queries, use the CAST function to explicitly cast fields to varchar. To
use indexes for such queries, create indexes on the cast expressions, as
explained in Using Casts in Secondary Indexes. |
ORDER BY | ORDER BY queries specify a sort order. If the ordering and sorting of the index key list match the ordering specified in a query, the optimizer in Drill does not have to sort the data after the index scan. See ORDER BY Queries. |
Multi-index | Multi-index queries contain conditions on multiple fields. Drill can scan multiple indexes and use the intersection of the matching documents to optimize these queries. Multi-index queries are an alternative to using composite key indexes. See Multi-Index Queries. |
Drill can create index plans for queries with and without filters in the WHERE clause. For example, Drill can create an index plan for an ORDER BY query that does not have filters.
- GROUP BY
- JOIN
- DISTINCT
Types of Indexes Supported by HPE Ezmeral Data Fabric Database
Index Type | Description |
Simple | Simple indexes are indexes with a single indexed field (or key). See Simple Indexes. |
Composite | Composite indexes are indexes that have more than one indexed field (or key). See Composite Indexes. |
Hashed | Hashed indexes are indexes that distribute keys across logical partitions to avoid the creation of hot spots when HPE Ezmeral Data Fabric Database updates the index with new keys from the JSON table. See Hashed Indexes. |
Covering | A covering index is an index that allows HPE Ezmeral Data Fabric Database to process a query using only the secondary indexes. HPE Ezmeral Data Fabric Database does not have to read data in the JSON table. See Covering Indexes. |
Indexes with the CAST function | Indexes with the CAST function convert the indexed field to the data type specified by the CAST function and store the results. See Using Casts in Secondary Indexes. |
Steps Required to Use Indexes
- Install the latest version of the required data-fabric software on the cluster. See Preparing Clusters for Querying using Secondary Indexes on JSON Tables and Installing Drill.
- Evaluate your queries and design indexes that support the queries. See Understanding the Secondary Index Workflow and Designing Secondary Indexes.
- Create indexes on JSON tables in HPE Ezmeral Data Fabric Database. See Adding Secondary
Indexes on JSON Tables and Managing
Secondary Indexes. NOTEThe user that creates indexes on a JSON table must have created the table or have the
indexperm
permission in addition toreadAce
on the volume andlookupdir
on directories in the table path. If you do not have these permissions, consult with your system administrator. - Issue queries.
- Verify that Drill uses the available indexes. See Determining Index Use and Troubleshooting Indexes.
Additional Information
- To see how Drill selects a query plan, see Selection and Execution of Secondary Indexes.
- To learn about the index planning and execution configuration options available in Drill, see Index Planning and Execution Configuration Options.
- For information about index architecture, see Implementation of Secondary Indexes.