Using HPE Ezmeral Data Fabric Monitoring (Spyglass Initiative)
HPE Ezmeral Data Fabric Monitoring (part of the Spyglass initiative) provides the ability to collect, store, and view metrics and logs for nodes, services, and jobs/applications.
Metric Monitoring
Administrators can monitor the current status of the cluster and anticipate future cluster
requirements with dashboards. For example, you can use metrics dashboards to visualize the following:
- Storage Utilization
- Use metrics dashboards to monitor storage trends. For example, you can compare file system usage at different times against the file system capacity and then allocate resources to the file system accordingly.
- Node Utilization
- Use metrics dashboards to check for node overload. For example, if the CPU usage is high on a few nodes, you may want to distribute the load across more nodes for better performance and efficiency.
- HPE Ezmeral Data Fabric Database Operational Trends
- Use metrics dashboards to display historical trends for HPE Ezmeral Data Fabric Database operations. For example, if a user reports HPE Ezmeral Data Fabric Database slowness, you can use the historical trends for row scan, get, and put operations to identify the node or nodes on which the performance degradation occurs.
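The storage and node checks described above can be illustrated with a minimal Python sketch. It does not call any HPE Ezmeral Data Fabric API: the sample data, the node names, the 85 percent threshold, and the function names `overloaded_nodes` and `storage_utilization` are all assumptions made for illustration, standing in for values you would read from a metrics dashboard.

```python
from statistics import mean

# Hypothetical samples: node name -> CPU usage percentages over a time window.
# In practice these values would come from the metrics store, not a literal.
cpu_samples = {
    "node-a": [92.0, 95.5, 97.0],
    "node-b": [35.0, 40.0, 38.5],
    "node-c": [88.0, 91.0, 93.5],
}


def overloaded_nodes(samples, threshold=85.0):
    """Return the sorted names of nodes whose mean CPU usage exceeds threshold.

    The 85.0 default is an assumed value, not a product recommendation.
    """
    return sorted(
        node for node, values in samples.items()
        if values and mean(values) > threshold
    )


def storage_utilization(used_bytes, capacity_bytes):
    """Return used storage as a fraction of capacity (0.0 to 1.0 when not over capacity)."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return used_bytes / capacity_bytes


print(overloaded_nodes(cpu_samples))    # ['node-a', 'node-c']
print(storage_utilization(750, 1000))   # 0.75
```

Averaging over a window rather than reacting to a single sample reflects the trend-oriented use of dashboards described above; a different aggregation may suit other workloads better.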
Log Monitoring
Administrators can use dashboards to visualize, search, and review logs when
troubleshooting issues. For example, you can use log dashboards to troubleshoot the
following issues:
- Service Failures
- When metrics indicate that one or more services are down, use log dashboards to check the logs for each failed service and drill down to each associated node.
- Application Failures
- When an application or job fails, use log dashboards to identify possible bottlenecks. For example, you can search the logs for a given application ID across all the nodes in the cluster.
- File System Performance
- When users experience slowness in the file system or in NFS for the HPE Ezmeral Data Fabric, use log dashboards to search the HPE Ezmeral Data Fabric file system logs for service errors or application issues.
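As a rough illustration of searching logs for an application ID across nodes, the following Python sketch groups matching log messages by node. It is not the log dashboard's query mechanism: the record layout, the node names, the application IDs, and the function name `find_by_application` are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical log records, one dict per log line, already collected
# from several nodes. The field names are assumptions for this sketch.
log_records = [
    {"node": "node-a", "message": "application_0001 container launched"},
    {"node": "node-b", "message": "application_0002 container launched"},
    {"node": "node-b", "message": "ERROR application_0001 container exited with status 1"},
    {"node": "node-c", "message": "heartbeat ok"},
]


def find_by_application(records, app_id):
    """Group the messages that mention app_id by the node that logged them."""
    matches = defaultdict(list)
    for record in records:
        if app_id in record["message"]:
            matches[record["node"]].append(record["message"])
    return dict(matches)


for node, messages in sorted(find_by_application(log_records, "application_0001").items()):
    print(node, messages)
```

Grouping by node mirrors the drill-down from a failed application to the individual nodes involved.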