Architecture
HPE Ezmeral Data Fabric Database is an enterprise-grade, high performance, NoSQL (“Not Only SQL”) database management system. You can use it to add realtime, operational analytics capabilities to big data applications. As a multi-model NoSQL database, it supports both JSON document models and key-value data models.
Why use HPE Ezmeral Data Fabric Database?
- Integrated analytics with SQL: HPE Ezmeral Data Fabric Database's integration with Drill for Data Fabric provides a low latency, distributed, SQL query engine for large-scale datasets, including structured and semi-structured, nested data.
- Operational analytics: HPE Ezmeral Data Fabric Database can run in the same cluster as Apache™ Hadoop® and Apache Spark, letting you immediately analyze or process live, interactive data. This also enables you to eliminate data silos to speed the data-to-action cycle, providing a more efficient data architecture.
- Global distribution of applications: Application access to HPE Ezmeral Data Fabric Database tables is distributable on a global scale.
- Flexible data model: You can use HPE Ezmeral Data Fabric Database as both a document database and a column-oriented database. As a document database, HPE Ezmeral Data Fabric Database stores JSON documents in JSON tables. As a column-oriented database, it stores binary files in binary tables.
How is HPE Ezmeral Data Fabric Database Related to HPE Ezmeral Data Fabric File Store?
HPE Ezmeral Data Fabric Database implements tables within the framework of the Data Fabric filesystem. HPE Ezmeral Data Fabric Database creates tables (both binary and JSON tables) in logical units called volumes.
What are HPE Ezmeral Data Fabric Database's Architectural Advantages?
HPE Ezmeral Data Fabric Database's architecture has the following advantages:
-
It reduces process overhead because it has no extra layers to pass through when performing operations on data.
HPE Ezmeral Data Fabric Database, like several other NoSQL databases, is a log-based database. HPE Ezmeral Data Fabric Database runs inside of the Data Fabric filesystem process, which enables it to read from and write to disks directly. In contrast, other NoSQL databases must communicate with a separate process to performs disk reads and writes. The approach taken by HPE Ezmeral Data Fabric Database eliminates extra process hops, duplicate caching, and needless abstractions, with the consequence of optimizing I/O operations on your data.
-
It minimizes compaction delays because it avoids I/O storms when it merges logged operations with structures on disk.
As a log-based database, HPE Ezmeral Data Fabric Database must write logged operations to disk. HPE Ezmeral Data Fabric Database stores table regions (also called tablets) and smaller structures within them partially as b-trees. Together with write-ahead logs (WAL), these b-trees comprise log-structured-merge trees. Write-ahead logs for the smaller structures within regions are periodically restructured by rolling merge operations on the b-trees. As HPE Ezmeral Data Fabric Database performs these merges at small scales, applications running against HPE Ezmeral Data Fabric Database see no significant effects on latency while the merges are taking place.NOTEApache HBase also uses the term regions.
What Design Factors are Important when Using HPE Ezmeral Data Fabric Database?
- Rowkey Optimization: The design of a table's rowkeys affects the speed at which client applications can access data. It also impacts database performance if hotspotting occurs. The better the design, the faster the data access. See Table Rowkey Design for more information.
- Column Family Optimization: Column families enable you to group related sets of data and restrict queries to a defined subset, leading to better performance. When you design a column family, think about what kinds of queries you are going to use most often, and group your columns accordingly. See Column Families in JSON Tables and Column Families in Binary Tables for more information.
- Replication Implementation: The design of table replication (in addition to the automatic replication that occurs with table regions within a volume) depends on your desired outcome and the complexity of your environment. See Table Replication for more information.
- Security Implementation: You can implement security at various levels including for table replication, JSON documents, and general access. Determining what level and where is part of the architectural design. See Security on JSON Tables, and Security.