Preparing Clusters for Querying using Secondary Indexes on JSON Tables

Describes the tasks needed to prepare your environment so you can query HPE Ezmeral Data Fabric Database JSON tables using secondary indexes.

Installing with the Data Fabric Installer

To install Data Fabric using the Data Fabric installer, follow the steps outlined at Installing with the Installer.

Starting with release 6.0.1, you do not have to enable a separate query service to use secondary indexes. The Operational Applications with HPE Ezmeral Data Fabric Database template installs and configures the replication gateways needed to update secondary indexes in HPE Ezmeral Data Fabric Database JSON and includes the components needed to run OJAI queries.

Some features, however, require the OJAI Distributed Query Service. The following summarizes how the functionality of OJAI queries differs with and without the service enabled:

Service Not Enabled:
  • Can run queries that use a single secondary index
  • Can sort data in your queries up to a configurable limit

Service Enabled:
  • Can run queries that use multiple secondary indexes
  • Can sort data in your queries without any limit
  • Can run queries in parallel

Selecting any of the following templates enables the OJAI Distributed Query Service:

  • Operational Applications with HPE Ezmeral Data Fabric Database and Distributed Query Service
  • Data Fabric Cluster: Batch, interactive and real-time analytics
  • Analytics with HPE Ezmeral Data Fabric Database

You can also explicitly enable the OJAI Distributed Query Service by selecting the service in the Custom Services template.

For more information about installer templates, see Auto-Provisioning Templates.

For more information about how secondary index selection and execution works in HPE Ezmeral Data Fabric Database JSON, see Selection and Execution of Secondary Indexes.

NOTE
The OJAI Query Service was renamed to the OJAI Distributed Query Service in release 6.0.1. All information about the OJAI Distributed Query Service also applies to the OJAI Query Service, except where noted.

Installing without the Data Fabric Installer

Other sections of the documentation describe the detailed steps for installing and configuring without the Data Fabric installer. Generally, you need to perform the following steps:

  1. Install software.

    To install Data Fabric without using the Data Fabric installer, follow the steps outlined at Installing without the Installer. In addition to the Data Fabric core packages, install Drill if you want advanced secondary index selection, sorts on large data sets, and parallel query execution. When installing Drill, follow the steps in Configure the OJAI Distributed Query Service.

  2. Install and configure replication gateways.

    Updates are propagated from JSON tables to their secondary indexes through replication gateways (see Gateways for Replicating HPE Ezmeral Data Fabric Database Tables), so you must install the gateways. Because the source JSON table and the secondary index are on the same volume within a cluster, configure an intracluster gateway, in which the source and destination clusters are the same.

    If your gateways are running on the same nodes as CLDB, then no additional configuration steps are required. See Configuring Gateways for Table and Stream Replication for details about this scenario and other options for configuring your gateways.
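For the case where the gateways do not run on the CLDB nodes, one option is to register them explicitly. The following is a minimal sketch, not a definitive procedure. It assumes that `maprcli cluster gateway set` and `maprcli cluster gateway list` accept the options shown (check Configuring Gateways for Table and Stream Replication for your release). The function name and the `MAPRCLI` override variable are hypothetical helpers introduced only for illustration.

```shell
#!/usr/bin/env bash
# MAPRCLI is a hypothetical override so the function can be pointed at a stub for a dry run.
MAPRCLI="${MAPRCLI:-maprcli}"

# Usage: configure_intracluster_gateway CLUSTER_NAME "GATEWAY_HOST [GATEWAY_HOST ...]"
configure_intracluster_gateway() {
  local cluster="$1" gateways="$2"
  if [ -z "$cluster" ] || [ -z "$gateways" ]; then
    echo 'usage: configure_intracluster_gateway CLUSTER_NAME "HOST1 HOST2"' >&2
    return 2
  fi
  # Intracluster gateway: source and destination are the same cluster,
  # so the destination cluster is this cluster's own name.
  # Assumption: -dstcluster and -gateways are the option names in your release.
  "$MAPRCLI" cluster gateway set -dstcluster "$cluster" -gateways "$gateways" || return 1
  # Print the registered gateways so the result can be checked by eye.
  "$MAPRCLI" cluster gateway list
}
```

A call might look like `configure_intracluster_gateway my.cluster.com "gw1.example.com gw2.example.com"`, where the cluster name and host names are placeholders for your own values.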

Upgrades

Other sections of the documentation describe the detailed steps for upgrades. Generally, you need to perform the following steps:

  1. Upgrade your Data Fabric software by following the instructions at Upgrading Core or EEP Components.
  2. Install Drill, if you have not already done so and want to sort large data sets and run queries in parallel.
  3. When installing Drill, follow the steps in Configure the OJAI Distributed Query Service.
  4. If you are upgrading without the Data Fabric installer, follow step 2 in the previous section to install and configure replication gateways.
  5. Enable the replication support needed to propagate index updates by running the following command:
    maprcli cluster feature enable -name mfs.feature.db.streams.v6.support

    See Step 4: Enable New Features for further details.
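The enable command in step 5 can be wrapped with a follow-up check that the feature is actually on. The following is a minimal sketch under stated assumptions: the enable command is the one given above, while the use of `maprcli cluster feature list -name` as a verification step is an assumption to confirm against the maprcli reference for your release. The function name and the `MAPRCLI` override variable are hypothetical.

```shell
#!/usr/bin/env bash
# MAPRCLI is a hypothetical override so the function can be pointed at a stub for a dry run.
MAPRCLI="${MAPRCLI:-maprcli}"
FEATURE="mfs.feature.db.streams.v6.support"

enable_index_replication_feature() {
  # Command taken from step 5 above.
  "$MAPRCLI" cluster feature enable -name "$FEATURE" || return 1
  # Assumption: 'cluster feature list' accepts -name to show a single feature's state.
  "$MAPRCLI" cluster feature list -name "$FEATURE"
}

# Run only where maprcli is available (for example, on a cluster node).
if command -v "$MAPRCLI" >/dev/null 2>&1; then
  enable_index_replication_feature
else
  echo "maprcli not found; run this on a cluster node" >&2
fi
```

Stopping at the first failure keeps the listing from masking an enable command that did not succeed.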

IMPORTANT
If you are performing a manual rolling upgrade (see Manual Rolling Upgrade Description), you must upgrade all nodes running replication gateways before you update tables that have indexes. Otherwise, the index updates will hang.