Definitions
This article contains two sets of definitions:
- General: General terms used with HPE Ezmeral Runtime Enterprise. See General.
- HPE Ezmeral Data Fabric on Kubernetes: Terms used exclusively when discussing HPE Ezmeral Data Fabric in a Kubernetes environment. See HPE Ezmeral Data Fabric on Kubernetes.
General
These articles use the following terms (provided in alphabetical order):
- Active Directory (or AD): This is a Microsoft directory service for Windows domain networks.
- Arbiter: An Arbiter is a designated host that triggers the Shadow Controller host to assume the Controller role if the primary Controller host fails.
- Cluster: For Kubernetes, a cluster is a group of nodes (hosts) that each contain one or more pods.
- Big Data/AI application: A Big Data application generally refers to a distributed, multi-node, inter-related service that can process large amounts of data by computing across several nodes. Examples of Big Data and AI applications include Hadoop, Spark, Kafka, TensorFlow, and H2O. Big Data/AI applications should not be confused with microservices.
- cnode: cnode is the HPE Ezmeral Runtime Enterprise caching node service, which reduces latency when transferring storage I/O requests to and from the HPE Ezmeral Runtime Enterprise implementation of the HDFS Java client.
- Compute host (or Compute Worker): In Kubernetes deployments, a compute host or compute worker is a Kubernetes host that is managed by the Kubernetes control plane and is not used for HPE Ezmeral Data Fabric on Kubernetes storage.
- Container: A container is a lightweight, standalone, executable software package that runs specific services. An Open Container Initiative (OCI)-compliant container includes code, runtime, system libraries, configurations, and so forth, that run as an isolated process in user space. An OCI-compliant container is typically used to deploy scalable and repeatable microservices.
- Controller host: A Controller is a host that manages the HPE Ezmeral Runtime Enterprise deployment.
- DataTap: A DataTap is a shortcut that points to a storage resource on the network. A Tenant Administrator creates a DataTap within a tenant and defines the storage namespace that the DataTap represents (such as a directory tree in a file system). A Tenant Member may then access paths within that resource for data input and/or output. Creating and editing DataTaps allows Tenant Administrators to control which storage areas are available to the members of each tenant, including any specific sharing or isolation of data between tenants.
- Deployment: Another term for platform.
- Ephemeral storage: Ephemeral storage is storage space available for backing the root file systems of hosts in HPE Ezmeral Runtime Enterprise. Ephemeral storage is not persistent. Contrast with Tenant storage.
- Filesystem Mount (or FS Mount): A filesystem mount enables HPE Ezmeral Runtime Enterprise to automatically add NFS volumes or mounts to Kubernetes clusters. This enables Kubernetes clusters to directly access NFS shares as if they were local directories.
- Gateway host (or Gateway Worker): A Gateway host or Gateway Worker is a host that is managed by a Controller. Each Gateway host in HPE Ezmeral Runtime Enterprise maps services running on containers to ports in order to allow users to access those services.
- HCP Agent: A custom Kubernetes controller that is installed on every Kubernetes cluster instantiated by HPE Ezmeral Runtime Enterprise. The agent performs key tasks, such as creating or associating namespaces to tenants, creating annotations for mapping NodePort services to Gateways, and creating FS mounts.
- Host: A host is either a physical server or a virtual server, located on your premises or in a public cloud, that is available to HPE Ezmeral Runtime Enterprise.
- HPE Ezmeral Runtime Enterprise: HPE Ezmeral Runtime Enterprise consists of the hosts that comprise the overall infrastructure available to create, run, and manage Kubernetes clusters.
- Kubeconfig: A file that configures access to Kubernetes when used in conjunction with either the kubectl command line tool or other clients.
- Kubectl: A command line tool for controlling a Kubernetes cluster.
- KubeDirector: An open-source project designed to simplify running complex stateful scale-out application clusters on Kubernetes. KubeDirector is built using the Kubernetes custom resource definition (CRD) framework and leverages the native Kubernetes API extensions and design philosophy. This enables transparent integration with Kubernetes user/resource management as well as existing clients and tools.
- Lightweight Directory Access Protocol (LDAP): This is a client-server directory service protocol that runs on a layer above the TCP/IP stack and provides a mechanism for connecting to, searching, and modifying networked directories.
- Master node: An outdated term for the Kubernetes control plane.
- Microservice: A microservice is a method of developing software applications as a suite of small, modular, and independently deployable services in which each service runs a unique process and communicates through a well-defined, lightweight mechanism to serve a business goal.
- Node: For Kubernetes, a node is a host that is a member of a Kubernetes cluster.
- Node storage: See Ephemeral Storage.
- Platform: A platform includes all of the tenants, projects, nodes, and users that exist on a given HPE Ezmeral Runtime Enterprise deployment. These articles may also use the term deployment to refer to "HPE Ezmeral Runtime Enterprise."
- Platform Administrator: The Platform Administrator (or Platform Admin) is an HPE Ezmeral Runtime Enterprise user that has been granted the role of Site Admin. A user with this role has the ability to create/delete tenants. This user will typically also be responsible for managing the hosts in the deployment.
- Pod: For Kubernetes, a pod is a group of containers deployed on a single host.
- Project (or AI/ML Project): A project or AI/ML project is a unit of resource partitioning and data/user access control in a given deployment that is used for running AI/ML workloads in HPE Ezmeral ML Ops. The resources of an HPE Ezmeral Runtime Enterprise deployment are shared among the tenants and AI/ML projects on that platform. All users who are members of an AI/ML project can access the resources and data objects available to that project. This is analogous to a tenant, except that a tenant is not pre-configured for AI/ML workloads.
- Security Assertion Markup Language (SAML): This is an open standard for exchanging authentication and authorization data between parties, such as between an identity provider (IdP) and a service provider.
- Shadow Controller host: A Shadow Controller host is a host that assumes the Controller host role if the primary Controller host fails.
- Tenant: A tenant is a unit of resource partitioning and data/user access control in a given deployment. The resources of an HPE Ezmeral Runtime Enterprise deployment are shared among the tenants on that platform. All users who are members of a tenant can access the resources and data objects available to that tenant. If a tenant is used to run HPE Ezmeral ML Ops, then it is called either a project or an AI/ML project.
- Tenant Administrator: A Tenant Administrator (or Tenant Admin) is a role granted to an HPE Ezmeral Runtime Enterprise user. A user with this role has the ability to manage the specific tenants for which they have been granted this role, including creating DataTaps for that tenant.
- Tenant Member: A Tenant Member (or Member) is a role granted to an HPE Ezmeral Runtime Enterprise user. A user with this role has non-administrative access to the specific tenants for which they have been granted this role. Members may use existing DataTaps for reading and writing data.
- Tenant storage: Tenant storage is a shared storage space that may be provided by either a local HPE Ezmeral Data Fabric installation within HPE Ezmeral Runtime Enterprise or a remote storage service. Every tenant is assigned a sandbox area within this space that is accessible by a special, non-editable TenantStorage DataTap. All virtual nodes within the tenant can access this DataTap and use it for persisting data that is not tied to the life cycle of a given cluster. Tenant storage differs from other DataTap-accessible storage as follows:
  - A tenant may not access tenant storage outside of its sandbox.
  - The Platform Administrator can choose to impose a space quota on the sandbox.
- User: A user is the set of information associated with each person accessing HPE Ezmeral Runtime Enterprise, including the authentication and site roles.
- Worker node: A Worker node is a container that is managed by a Master node in a cluster. For example, the Spark Worker is the worker node in a Spark virtual cluster. For Kubernetes, this is another term for Worker host. See node, above.
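The two rules listed under Tenant storage (a tenant cannot reach outside its sandbox, and the Platform Administrator may impose a space quota) can be illustrated with a minimal Python sketch. This is not how HPE Ezmeral Runtime Enterprise enforces these rules; the function names, the example paths, and the byte-based quota are all hypothetical and exist only to make the two constraints concrete.

```python
import posixpath
from typing import Optional


def is_within_sandbox(sandbox_root: str, requested_path: str) -> bool:
    """Return True if requested_path resolves to a location inside sandbox_root.

    Both arguments are hypothetical POSIX-style paths. Relative paths are
    resolved against the sandbox root; ".." segments are collapsed first.
    """
    root = posixpath.normpath(sandbox_root)
    target = posixpath.normpath(posixpath.join(root, requested_path))
    # Compare on a path-segment boundary so "/tenants/30" is not
    # mistaken for a child of "/tenants/3".
    return target == root or target.startswith(root.rstrip("/") + "/")


def fits_quota(used_bytes: int, write_bytes: int,
               quota_bytes: Optional[int]) -> bool:
    """Return True if a write fits the sandbox quota (None means no quota)."""
    return quota_bytes is None or used_bytes + write_bytes <= quota_bytes


# Hypothetical sandbox root for one tenant.
SANDBOX = "/tenants/3"
print(is_within_sandbox(SANDBOX, "data/input.csv"))
print(is_within_sandbox(SANDBOX, "../4/secret.csv"))
print(fits_quota(used_bytes=90, write_bytes=20, quota_bytes=100))
```

The segment-boundary comparison is the important design choice here: a plain string-prefix test would wrongly treat a sibling sandbox with a similar name as being inside the tenant's own sandbox.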
HPE Ezmeral Data Fabric on Kubernetes
The following terms are used when discussing HPE Ezmeral Data Fabric in a Kubernetes environment on HPE Ezmeral Runtime Enterprise. This list is intended to provide basic information to a user who is unfamiliar with HPE Ezmeral Data Fabric storage.
- HPE Ezmeral Data Fabric: A general purpose data store and file system that scales to support data-driven analytics, ML, and AI applications. HPE Ezmeral Data Fabric provides file store and NoSQL database (HBase API for binary and JSON) to move data in and out of the cloud, and provides event streams for streaming applications.
- HPE Ezmeral Data Fabric on Bare Metal: The name of the implementation of HPE Ezmeral Data Fabric on physical or virtual machines.
- HPE Ezmeral Data Fabric on Kubernetes: The name of the implementation of HPE Ezmeral Data Fabric in a Kubernetes cluster running in HPE Ezmeral Runtime Enterprise.
- Data Fabric: This is the short form of the term HPE Ezmeral Data Fabric. The term is often used when the type of implementation is not relevant to the concept or task.
- Embedded Data Fabric: This is a legacy option, and not supported on HPE Ezmeral Runtime Enterprise 5.5.0 or later releases.
- Data Fabric cluster: This is a Kubernetes cluster that is used for HPE Ezmeral Data Fabric storage. A Data Fabric cluster is a Custom Resource in Kubernetes that is supported by operators in HPE Ezmeral Runtime Enterprise.
- Node: A node is a Kubernetes host that has been added to an HPE Ezmeral Runtime Enterprise cluster.
- Data Fabric CR: This typically refers to the Custom Resource specification for a Data Fabric cluster that is supported by an HPE Ezmeral Runtime Enterprise dataplatform operator. It specifies each type of pod that the cluster would comprise. The per-pod specification may include CPU, memory, disk, and port requirements. Together with node labels and annotations, the Data Fabric CR influences the placement and scheduling of cluster pods by Kubernetes. HPE Ezmeral Runtime Enterprise creates and applies the Data Fabric CR when creating the first Data Fabric cluster. The Data Fabric CR may be subsequently patched/modified when expanding the cluster, or by a user with suitable privileges.
- Core Pods: These are the pods that are specified in the /spec/core path of a Data Fabric CR. Some examples of core pods in a Data Fabric cluster include CLDB, Zookeeper, MFS, and admincli pods.
- Services: These are generally the pods specified in the /spec/coreservices and /spec/monitoring paths of a Data Fabric CR. Some examples of service pods in a Data Fabric cluster include MCS (HPE Ezmeral Data Fabric Control System), Kibana, and Grafana. Any non-CLDB, non-ZK, and non-MFS pod may also be referred to as a service pod.
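The distinction between core pods and service pods follows from where each pod is declared in the Data Fabric CR. The Python sketch below shows that grouping on a heavily simplified, hypothetical CR represented as a dictionary. Only the /spec/core, /spec/coreservices, and /spec/monitoring paths come from the definitions above; the key names under each path and the empty per-pod settings are assumptions made for illustration and do not reflect the real CR schema.

```python
# Hypothetical, heavily simplified Data Fabric CR. A real CR carries
# per-pod CPU, memory, disk, and port requirements, omitted here.
data_fabric_cr = {
    "spec": {
        "core": {"cldb": {}, "zookeeper": {}, "mfs": {}, "admincli": {}},
        "coreservices": {"mcs": {}},
        "monitoring": {"kibana": {}, "grafana": {}},
    }
}


def classify_pods(cr: dict) -> dict:
    """Group pod types by the CR path under which they are declared.

    Pods under /spec/core are treated as core pods; pods under
    /spec/coreservices and /spec/monitoring are treated as services.
    """
    spec = cr.get("spec", {})
    core = sorted(spec.get("core", {}))
    services = sorted(
        list(spec.get("coreservices", {})) + list(spec.get("monitoring", {}))
    )
    return {"core": core, "services": services}


print(classify_pods(data_fabric_cr))
```

Missing paths are treated as empty, so a CR that declares no monitoring pods still yields a valid grouping.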