HColumnDescriptor and HTableDescriptor Support

This section describes the supported fields in the HColumnDescriptor and the HTableDescriptor classes.

HPE Ezmeral Data Fabric Database supports all of the methods that are in these classes. However, it supports only a subset of their fields.

HColumnDescriptor Class

Field	Description
BLOCKSIZE	Size of blocks in files stored to the file system (HFiles).
BLOOMFILTER	Whether to use Bloom filters.
COMPRESSION	Compression type.
IN_MEMORY	Whether to serve from memory or not.
MIN_VERSIONS	Minimum number of versions to keep.
NAME	Name of the column family.
TTL	Time to live of cell contents.
VERSIONS	Number of versions to keep.

HTableDescriptor Class

Field	Description
AUTOSPLIT	Specifies whether to split the table into regions automatically as the table grows. The average size of each region is determined by the `regionsizemb` parameter. The default value is `true`.
BULKLOAD	Boolean. Specifies whether to perform a full bulk load of the table. The default is `false`. For more information, see Bulk Loading and Data Fabric Tables.
DELETE_TTL	Used for multi-master replication. Normally, delete operations are purged after the affected table cells are updated. Whereas the result of an update is saved in a table until another change overwrites or deletes it, the result of a delete is not saved. In multi-master replication, this difference can lead to tables being unsynchronized.
Example: Suppose that you have set up multi-master replication between table `customers` in the cluster `sanfrancisco` and table `customers` in the cluster `newyork`. Client applications then make these two changes:
- On `/mapr/sanfrancisco/customers`, put row A at 10:00:00 AM.
- On `/mapr/newyork/customers`, delete row A at 10:00:01 AM.
On `/mapr/sanfrancisco/customers`, the order of operations is:
1. Put row A with a timestamp of 10:00:00 AM.
2. Delete row A with a timestamp of 10:00:01 AM. (This operation is replicated from `/mapr/newyork/customers`.)
On `/mapr/newyork/customers`, the order of operations is:
1. Delete row A with a timestamp of 10:00:01 AM.
2. Put row A with a timestamp of 10:00:00 AM. (This operation is replicated from `/mapr/sanfrancisco/customers`.)
Although the put happened on `/mapr/sanfrancisco/customers` at 10:00:00 AM, it reaches `/mapr/newyork/customers` several seconds later. Suppose that the put actually arrives at `/mapr/newyork/customers` at 10:00:03 AM. To ensure that both tables stay synchronized, `/mapr/newyork/customers` should preserve the delete until after the put is replicated. Then, the delete can be applied after the put. Therefore, the time-to-live for the delete should be at least long enough for the put to arrive at `/mapr/newyork/customers`. In this case, the time-to-live should be at least 3 seconds.
In general, the time-to-live for deletes should be greater than the amount of time that it takes replicated operations to reach replicas. By default, the value is 24 hours.
For example, suppose (to extend the scenario above) that you pause replication during weekdays and resume it on weekends. The put takes place on Monday morning on `/mapr/sanfrancisco/customers` at 10:00:00 AM, and the delete takes place on `/mapr/newyork/customers` at 10:00:01 AM. Replication does not resume until 12:00:00 AM Saturday morning. Given the volume of operations to be replicated and the potential for network problems, it is possible that these operations will not be replicated until Sunday. In this scenario, a value of 7 days for DELETE_TTL (7 multiplied by 24 hours) should provide sufficient margin.
NAME	Name of the table.
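
The DELETE_TTL reasoning above can be illustrated with a small, self-contained Java sketch. It is a simplified model, not Data Fabric or HBase code: the class `DeleteTtlSketch`, the enum `RowState`, and the method `stateAfterLatePut` are hypothetical names introduced only for this illustration, and the calendar dates are assumed (2024-06-24 is used as the Monday). The model assumes that a replica keeps a delete marker until its time-to-live expires, and that a late-arriving put with an older timestamp is masked only while that marker is still retained.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical illustration only; not part of any Data Fabric or HBase API.
class DeleteTtlSketch {

    // Final state of row A on the replica that receives the late put.
    enum RowState { PRESENT, DELETED }

    /**
     * Simplified model: the delete marker is retained until
     * deleteTimestamp + deleteTtl. A replicated put that arrives while the
     * marker is retained, and that carries an older timestamp than the
     * delete, stays masked. Otherwise the put is applied and the row reappears.
     */
    static RowState stateAfterLatePut(LocalDateTime putTimestamp,
                                      LocalDateTime deleteTimestamp,
                                      LocalDateTime putArrival,
                                      Duration deleteTtl) {
        LocalDateTime purgeTime = deleteTimestamp.plus(deleteTtl);
        boolean deleteRetained = !putArrival.isAfter(purgeTime);
        boolean deleteIsNewer = deleteTimestamp.isAfter(putTimestamp);
        return (deleteRetained && deleteIsNewer) ? RowState.DELETED : RowState.PRESENT;
    }

    public static void main(String[] args) {
        // Assumed dates: Monday 2024-06-24, Sunday 2024-06-30.
        LocalDateTime put = LocalDateTime.of(2024, 6, 24, 10, 0, 0);
        LocalDateTime delete = LocalDateTime.of(2024, 6, 24, 10, 0, 1);

        // Scenario 1: the put reaches newyork at 10:00:03 AM.
        LocalDateTime arrival = LocalDateTime.of(2024, 6, 24, 10, 0, 3);
        // A 3-second TTL keeps the delete long enough: tables stay in sync.
        System.out.println(stateAfterLatePut(put, delete, arrival, Duration.ofSeconds(3))); // DELETED
        // A 1-second TTL purges the delete too early: row A reappears.
        System.out.println(stateAfterLatePut(put, delete, arrival, Duration.ofSeconds(1))); // PRESENT

        // Scenario 2: replication is paused until the weekend; the put
        // arrives on Sunday (the exact arrival time is an assumption).
        LocalDateTime sundayArrival = LocalDateTime.of(2024, 6, 30, 12, 0, 0);
        // The 24-hour default expires on Tuesday, before the put arrives.
        System.out.println(stateAfterLatePut(put, delete, sundayArrival, Duration.ofHours(24))); // PRESENT
        // A 7-day TTL still holds the delete on Sunday.
        System.out.println(stateAfterLatePut(put, delete, sundayArrival, Duration.ofDays(7))); // DELETED
    }
}
```

In this model, a result of `PRESENT` on `/mapr/newyork/customers` means the replica has diverged from `/mapr/sanfrancisco/customers`, where row A ends up deleted. The sketch only checks whether a given time-to-live covers the replication delay. It does not model how the platform actually stores or purges delete markers.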


HPE Ezmeral Data Fabric – Customer-Managed 7.9.0 Documentation
Abstract	This site contains documentation for the customer-managed platform of the HPE Ezmeral Data Fabric version 7.9.0 including installation, configuration, administration, and reference content, as well as content for the associated bundled ecosystem components and drivers.
Published	April 2025
Edition	7.9.0
Topic last updated	2024-06-25