Job Scheduling

You can use job scheduling to prioritize the YARN applications that run on your MapR cluster.

The MapReduce system always supports at least one queue, named default. Hence, any configured list of queue names should always contain the string default. Some job schedulers, like the Capacity Scheduler, support multiple queues.
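The requirement that the default queue is always present can be checked with a few lines of code. The following is a minimal sketch, not part of the product: it assumes the queue names are supplied as a comma-separated string, and the helper names parse_queue_names and has_default_queue are hypothetical.

```python
def parse_queue_names(value):
    """Split a comma-separated queue list, trimming spaces and dropping blanks."""
    return [name.strip() for name in value.split(",") if name.strip()]


def has_default_queue(value):
    """Return True when the queue list includes the mandatory 'default' queue."""
    return "default" in parse_queue_names(value)


# Hypothetical queue lists used only for illustration.
print(has_default_queue("default,analytics"))  # True
print(has_default_queue("analytics, etl"))     # False
```

A check like this could run before a configuration change is rolled out, so that a queue list missing default is caught early.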

The default job scheduler is the Fair Scheduler, which is designed for a production environment with multiple users or groups that compete for cluster resources.

The MapR Converged Data Platform supports these job schedulers:

FIFO queue-based scheduler: The FIFO queue scheduler runs jobs based on the order in which the jobs were submitted. You can prioritize a job by changing the value of the mapred.job.priority property or by calling the setJobPriority() method.
Fair Scheduler: This is the default scheduler. The Fair Scheduler allocates a share of cluster capacity to each user over time. The design goal of the Fair Scheduler is to assign resources to jobs so that each job receives an equal share of resources over time. The Fair Scheduler enforces fair sharing within each queue. Running jobs share the queue's resources.
Capacity Scheduler: The Capacity Scheduler enables users or organizations to simulate an individual Hadoop cluster with FIFO scheduling for each user or organization. You can define organizations using queues.
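To make the difference between FIFO ordering and fair sharing concrete, here is a small, self-contained simulation. It is an illustrative sketch only and does not call any Hadoop or Data Fabric API. The priority names follow Hadoop's JobPriority values, but the numeric ranks, the Job class, and the functions fifo_order and fair_shares are assumptions made for this example. It also assumes that a higher-priority job runs ahead of earlier submissions and that a fair queue splits its capacity equally among running jobs.

```python
from dataclasses import dataclass

# Priority names mirror Hadoop's JobPriority values; the numeric ranks are an
# assumption for this sketch (a lower rank runs first).
PRIORITY_RANK = {"VERY_HIGH": 0, "HIGH": 1, "NORMAL": 2, "LOW": 3, "VERY_LOW": 4}


@dataclass
class Job:
    name: str
    submitted_at: int        # hypothetical submission sequence number
    priority: str = "NORMAL"


def fifo_order(jobs):
    """Order jobs by priority first, then by submission order."""
    return sorted(jobs, key=lambda j: (PRIORITY_RANK[j.priority], j.submitted_at))


def fair_shares(total_capacity, running_jobs):
    """Split a queue's capacity equally among its running jobs."""
    if not running_jobs:
        return {}
    share = total_capacity / len(running_jobs)
    return {job.name: share for job in running_jobs}


jobs = [Job("etl", 1), Job("report", 2, "HIGH"), Job("adhoc", 3)]
print([j.name for j in fifo_order(jobs)])  # ['report', 'etl', 'adhoc']
print(fair_shares(120, jobs))  # {'etl': 40.0, 'report': 40.0, 'adhoc': 40.0}
```

With FIFO ordering, raising the priority of report moves it ahead of etl even though etl was submitted first. With fair sharing, all three jobs receive the same slice of the queue regardless of submission order.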



HPE Ezmeral Data Fabric – Customer-Managed 7.7.0 Documentation
Abstract	This site contains documentation for the customer-managed platform of the HPE Ezmeral Data Fabric version 7.7.0 including installation, configuration, administration, and reference content, as well as content for the associated bundled ecosystem components and drivers.
Published	July 2024
Edition	7.7.0