Monitoring and Alerting

This article describes monitoring and alerting. Also see the following articles for additional information:

Monitoring

Metricbeat collects data from containers by running container stats or other container commands. It also retrieves system-level information by reading cgroup data from the OS /proc files. Metricbeat then sends the collected metrics to Elasticsearch, where the data can be visualized on the Dashboard screen or through Kibana.
NOTE: Kibana is available only for Kubernetes clusters running HPE Ezmeral Data Fabric. See HPE Ezmeral Data Fabric Introduction.
When platform-level HA is enabled (see High Availability), Elasticsearch will run on three hosts to ensure data replication and backup. Metricbeat is a lightweight service with minimal memory requirements.

The high-level workflow is as follows:

  1. Metricbeat captures monitoring information and provides this data to Elasticsearch.
  2. When platform HA is enabled, Elasticsearch replicates this data across the Controller, Shadow Controller, and Arbiter hosts.
  3. Elasticsearch data can be visualized using either the Dashboard screen or Kibana.
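
The data that Metricbeat stores in Elasticsearch can also be inspected with a direct search request. The following is a minimal sketch, not the method used by the Dashboard screen or Kibana. It assumes that the metrics use the default Metricbeat index pattern (metricbeat-*) and that the Metricbeat system module field system.cpu.total.pct is present; the function names, endpoint URL, and index pattern are hypothetical and may differ in your deployment.

```python
import json
import urllib.request


def build_cpu_query(minutes=15, size=10):
    """Build a search body for the most recent Metricbeat CPU documents.

    The field name system.cpu.total.pct is an assumption based on the
    Metricbeat system module; adjust it to match the fields in your indices.
    """
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "filter": [
                    # Only documents from the last `minutes` minutes.
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
                    # Only documents that carry the assumed CPU field.
                    {"exists": {"field": "system.cpu.total.pct"}},
                ]
            }
        },
    }


def search(base_url, index_pattern, body, timeout=10):
    """POST a search body to the Elasticsearch _search API and return the JSON reply.

    base_url and index_pattern are supplied by the caller; no authentication
    is handled in this sketch.
    """
    request = urllib.request.Request(
        f"{base_url}/{index_pattern}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, search("http://localhost:9200", "metricbeat-*", build_cpu_query(minutes=5)) would request the ten newest matching documents from the last five minutes, assuming Elasticsearch is reachable at that placeholder address.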

To access Kibana, see the following:

  • If this is a Kubernetes deployment of HPE Ezmeral Runtime Enterprise, open the Kubernetes Clusters screen. The Details column of the cluster contains a link to the Kibana service. Links to services are not shown when HPE Ezmeral Runtime Enterprise is in Lockdown mode.

    For default user name and password information for Kibana and Grafana on Data Fabric clusters, see Managing HPE Ezmeral Data Fabric on Kubernetes.

Alerting

Nagios runs as a container on the Controller host. The Nagios implementation is open source with no customization; however, a few Nagios scripts are included to monitor and provide alerts for some specific services. These scripts are located in the /usr/lib64/nagios/plugins directory.
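
For readers unfamiliar with how Nagios check scripts report status, the following sketch illustrates the general Nagios plugin convention: a plugin prints one line of status text, optionally followed by performance data after a | character, and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). This is not one of the scripts included in the plugins directory; the service name example_service, the thresholds, and the function names are hypothetical.

```python
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit, name="example_service", unit="%"):
    """Map a measured value to a Nagios status code and a one-line message.

    `name` is a hypothetical service label used only for illustration.
    """
    if value is None:
        # No measurement available: report UNKNOWN rather than guessing.
        return UNKNOWN, f"{LABELS[UNKNOWN]} - {name}: no data"
    if value >= crit:
        code = CRITICAL
    elif value >= warn:
        code = WARNING
    else:
        code = OK
    # Text after '|' is performance data in the form label=value[unit];warn;crit
    message = (
        f"{LABELS[code]} - {name} at {value}{unit} "
        f"| {name}={value}{unit};{warn};{crit}"
    )
    return code, message


def run_check(value, warn=80, crit=90):
    """Print the status line and exit with the matching code, as Nagios expects."""
    code, message = evaluate(value, warn, crit)
    print(message)
    sys.exit(code)
```

Keeping the threshold logic in evaluate, separate from the printing and exiting in run_check, makes it possible to test the logic without terminating the process.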

There are two ways to configure Nagios alerts: