HBase
Apache HBase™ is the Hadoop database, a distributed, scalable, big data store. You can use Apache HBase when you need random, realtime read-write access to your Big Data. This section describes how to use HBase with the HPE Ezmeral Data Fabric, but does not duplicate Apache documentation.
The goal of Apache HBase is to host very large tables – billions of rows with millions of columns – atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and Hadoop-compatible filesystems, such as the Data Fabric file system.
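To make "versioned, column-oriented" more concrete, the following is a minimal in-memory sketch of a Bigtable-style data model: each row key maps to columns named family:qualifier, and each column keeps several timestamped versions of its value. It is a toy illustration only, not the HBase client API. The class name ToyTable, its methods, and the max_versions parameter are hypothetical names chosen for this example.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of a Bigtable-style table (hypothetical; not the HBase API)."""

    def __init__(self, max_versions=3):
        # Assumed retention limit per cell, chosen for illustration.
        self.max_versions = max_versions
        # row key -> "family:qualifier" -> [(timestamp, value), ...], newest first
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, timestamp):
        """Store a new version of a cell, keeping only the newest versions."""
        cells = self._rows[row][column]
        # A write with an existing timestamp replaces that version.
        cells[:] = [cell for cell in cells if cell[0] != timestamp]
        cells.append((timestamp, value))
        cells.sort(key=lambda cell: cell[0], reverse=True)
        del cells[self.max_versions:]

    def get(self, row, column, versions=1):
        """Return up to `versions` (timestamp, value) pairs, newest first."""
        return self._rows.get(row, {}).get(column, [])[:versions]

    def scan(self, start, stop):
        """Return row keys in sorted order with start <= key < stop."""
        return [key for key in sorted(self._rows) if start <= key < stop]


table = ToyTable(max_versions=2)
table.put("user#1001", "info:email", "a@example.com", timestamp=1)
table.put("user#1001", "info:email", "b@example.com", timestamp=2)
table.put("user#1001", "info:email", "c@example.com", timestamp=3)

# Random read by row key and column: the newest version is returned by default.
print(table.get("user#1001", "info:email"))
# Older versions remain readable up to the retention limit.
print(table.get("user#1001", "info:email", versions=5))
```

In this sketch the oldest of the three writes is discarded because the table retains only two versions per cell. A real deployment configures version retention per column family and reads and writes through the HBase client rather than an in-process dictionary.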
Installing Apache HBase on a Data Fabric cluster involves storing all HBase components in a single volume mapped to the directory /hbase in the cluster. Tables are stored in a flat namespace, not grouped logically with related files.
Because all Apache HBase data resides in one volume, only one set of storage policies can be
applied to the entire Apache HBase datastore. Mirrors and snapshots of the HBase volume do not
provide functional replication of the datastore. Despite this limitation, mirrors can be used
to back up HLogs and HFiles in order to provide a recovery point for Apache HBase data.
This section documents how to work with HBase on the HPE Ezmeral Data Fabric. You can also refer to the documentation available from the Apache HBase project.