Alerting

Describes alerting in HPE Ezmeral Unified Analytics Software.

An alert in HPE Ezmeral Unified Analytics Software is a system notification that informs you of issues, warnings, and updates. Unified Analytics uses Prometheus to monitor and collect metrics from the nodes, system processes, and applications that run in an HPE Ezmeral Unified Analytics Software cluster, and generates alerts based on those metrics. Alertmanager in Unified Analytics lets you control how alerts behave. For example, you can silence specific alerts or send notifications to a specific user when the system raises an alert.

To learn about Prometheus and Alertmanager in detail, see the Prometheus and Alertmanager documentation.

The alert system in HPE Ezmeral Unified Analytics Software consists of several components. The following sections include an architectural diagram, component descriptions, and the alerting workflow.

Alerting Workflow

The following steps describe the alerting workflow in HPE Ezmeral Unified Analytics Software.

Collect Metrics

Prometheus scrapes metrics from targets, which are endpoints that exporters expose. For example, Prometheus collects CPU usage metrics from servers through Node Exporter and database query latency from MySQL through Mysqld Exporter.
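To make the scrape step concrete, the following is a minimal Python sketch of reading the plain-text format that exporters expose. The sample text, the `parse_metrics` name, and the metric values are illustrative assumptions, not output from a Unified Analytics cluster. The parser is simplified and skips lines that carry an optional timestamp.

```python
import re

# Hypothetical sample of the text an exporter endpoint might return.
SAMPLE = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1200.5
node_cpu_seconds_total{cpu="0",mode="user"} 300.25
node_memory_MemAvailable_bytes 2147483648
"""

# metric_name{label="value",...} value   (timestamps are not handled here)
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified parser cannot read
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


samples = parse_metrics(SAMPLE)
```

Each parsed sample pairs a metric name with its labels and value, which is the shape that alert rules are later evaluated against.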

Evaluate Alert Rules

Prometheus continuously evaluates the alerting rules that are defined in PromQL against the collected metrics.

Generate Alerts

If the condition for a rule is met, Prometheus generates an alert. For example:

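The following alert rule expression fires when the API server returns more than 5 HTTP 500 errors per minute, averaged over the last 5 minutes.
```
# rate() returns a per-second value; multiply by 60 to get errors per minute
sum by (job) (rate(http_requests_total{job="api_server", status_code="500"}[5m])) * 60 > 5
```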
The following alert rule expression fires when the root filesystem has less than 10 GiB free.
```
node_filesystem_avail_bytes{mountpoint="/"} < 10 * 1024 * 1024 * 1024
```
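The arithmetic behind the two example conditions can be sketched in Python. The function names and sample values are hypothetical. The rate helper assumes that the counter did not reset between the two samples, which Prometheus itself handles but this sketch does not.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def per_minute_rate(earlier, later):
    """Per-minute increase of a counter between two (timestamp_seconds, value) samples.

    Assumes no counter reset between the two samples.
    """
    (t0, v0), (t1, v1) = earlier, later
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (v1 - v0) / (t1 - t0) * 60


def error_rate_alert(earlier, later, limit_per_minute=5):
    """True when the error counter grows faster than the per-minute limit."""
    return per_minute_rate(earlier, later) > limit_per_minute


def low_disk_alert(avail_bytes, min_free_bytes=10 * GIB):
    """True when available space is below the threshold (10 GiB by default)."""
    return avail_bytes < min_free_bytes
```

For example, a counter that moves from 100 to 112 over 120 seconds grows by 6 per minute, so `error_rate_alert((0, 100), (120, 112))` is true, and `low_disk_alert(8 * GIB)` is true because 8 GiB is below the 10 GiB threshold.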

Dispatch Alerts to Alertmanager

Prometheus sends the generated alerts to the configured Alertmanager.
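As a rough illustration of what is dispatched, the following Python sketch builds a JSON body in the general shape that the Alertmanager v2 alerts API accepts: a list of alerts, each with `labels`, `annotations`, and a `startsAt` timestamp. The `build_alert_payload` helper, the alert name, and the label values are assumptions for illustration. The sketch only builds the body and sends nothing.

```python
import json
from datetime import datetime, timezone


def build_alert_payload(alertname, labels, annotations, starts_at):
    """Serialize one alert as a JSON list, the shape used when posting alerts."""
    alert = {
        "labels": {"alertname": alertname, **labels},
        "annotations": annotations,
        "startsAt": starts_at.isoformat(),  # RFC 3339 style timestamp
    }
    return json.dumps([alert])


# Hypothetical alert for a node with low disk space.
body = build_alert_payload(
    "LowDiskSpace",
    {"severity": "critical", "mountpoint": "/"},
    {"summary": "Root filesystem has less than 10 GiB free"},
    datetime(2024, 5, 22, 12, 0, tzinfo=timezone.utc),
)
```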

Process Alerts

Alertmanager deduplicates, groups, and routes the alerts based on configured rules.
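The three processing steps can be sketched in Python as follows. This is a simplified model and not Alertmanager's implementation: alerts are identified by their label set, grouped by a chosen list of label names, and routed to the first route whose matchers all apply. The alert data, receiver names, and function names are hypothetical.

```python
from collections import defaultdict


def fingerprint(alert):
    """Identify an alert by its full label set."""
    return tuple(sorted(alert["labels"].items()))


def deduplicate(alerts):
    """Keep the first alert seen for each distinct label set."""
    unique = {}
    for alert in alerts:
        unique.setdefault(fingerprint(alert), alert)
    return list(unique.values())


def group_alerts(alerts, group_by):
    """Group alerts that share the same values for the group_by labels."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


def pick_receiver(labels, routes, default_receiver):
    """Return the receiver of the first route whose matchers all match."""
    for matchers, receiver in routes:
        if all(labels.get(key) == value for key, value in matchers.items()):
            return receiver
    return default_receiver


alerts = [
    {"labels": {"alertname": "LowDiskSpace", "node": "worker-1", "severity": "critical"}},
    {"labels": {"alertname": "LowDiskSpace", "node": "worker-1", "severity": "critical"}},  # duplicate
    {"labels": {"alertname": "LowDiskSpace", "node": "worker-2", "severity": "critical"}},
    {"labels": {"alertname": "PodRestarting", "node": "worker-1", "severity": "warning"}},
]
routes = [({"severity": "critical"}, "on-call"), ({"severity": "warning"}, "team-email")]

unique = deduplicate(alerts)
groups = group_alerts(unique, ["alertname"])
```

Grouping by `alertname` here collapses the two remaining `LowDiskSpace` alerts into one group, so a single notification can cover both nodes.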

Send Notifications to Receivers

Alertmanager sends notifications to the appropriate recipients through the designated channels.
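On the receiving side, a custom webhook receiver might turn a notification into a readable message. The sketch below assumes a payload with `status`, `receiver`, and `alerts` fields, where each alert carries `labels` and `annotations`, as in Alertmanager's webhook notifications. The `summarize_notification` name and the sample payload are hypothetical.

```python
import json


def summarize_notification(payload_json):
    """Format a webhook-style notification payload as a short text message."""
    payload = json.loads(payload_json)
    lines = [f"[{payload['status'].upper()}] receiver={payload['receiver']}"]
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        summary = alert.get("annotations", {}).get("summary", "(no summary)")
        lines.append(f"- {labels.get('alertname', 'unknown')}: {summary}")
    return "\n".join(lines)


# Hypothetical notification for one firing alert.
sample_payload = json.dumps({
    "status": "firing",
    "receiver": "on-call",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "LowDiskSpace", "severity": "critical"},
            "annotations": {"summary": "Root filesystem has less than 10 GiB free"},
        }
    ],
})

message = summarize_notification(sample_payload)
```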

Resource Events Alerting

Alerts are triggered for the following events:

High resource CPU usage
High resource memory usage
Unusual pod restart
Pods not in running state
PVC status not Bound
Failed jobs
Failed cronjobs
Node failures
Unusual node memory or CPU usage
Kubelet failures
Node filesystem issues
Node network issues
Prometheus issues

To find the list of alerts generated in HPE Ezmeral Unified Analytics Software, see List of Alerts.

HPE Ezmeral Unified Analytics Software 1.5 Documentation
Published	June 2025
Edition	1.5.0
Topic last updated	2024-05-22