Spyglass on Streams
Release 6.0 of the HPE Ezmeral Data Fabric introduced Spyglass on Streams. When you install release 6.0 or later, Streams is the default mechanism through which metrics flow from the Collectd service to OpenTSDB. Moving metrics through streams secures the data and provides a mechanism to perform real-time data analytics.
The Flow of Metrics via Streams
The Collectd service collects node-level and service-level metrics from each node in the cluster. The Collectd service hashes metrics to a stream and writes the metrics into topics in that stream.
In release 6.1.0 and later, Collectd creates one stream per cluster: /var/mapr/mapr.monitoring/metricstreams/0. Topic names use the format <hostname>. For example: mfs81.qa.lab.
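As a minimal sketch, the stream path and a hostname-based topic name can be combined into a fully qualified topic name. The <stream path>:<topic> form is the usual way Data Fabric streams address a topic. The helper name full_topic_name is hypothetical and is used only for illustration.

```python
# Stream path quoted in the text above (release 6.1.0 and later).
METRICS_STREAM = "/var/mapr/mapr.monitoring/metricstreams/0"


def full_topic_name(hostname: str, stream: str = METRICS_STREAM) -> str:
    """Join a stream path and a topic name as <stream path>:<topic>.

    Hypothetical helper: it only builds the string and does not
    contact the cluster.
    """
    return f"{stream}:{hostname}"


print(full_topic_name("mfs81.qa.lab"))
```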
(tsdb_cluster_mgmt.sh) that runs periodically.
Determining How Many OpenTSDB Nodes to Install
Having multiple OpenTSDB nodes in the cluster distributes the workload. The number of partitions and OpenTSDB nodes determines the level of parallelism for consumption.
Each OpenTSDB node can consume one partition at a time. By default, metrics data is divided across 12 partitions in each topic and optimal parallelism is reached if there are five OpenTSDB nodes to consume the partitions. See Parallelism When Consuming Messages. Note that the term “consumer” in the topic equates to an OpenTSDB node in Spyglass on Streams.
- Three OpenTSDB nodes in a 10-node cluster
- Four OpenTSDB nodes in a 100-node cluster
- Five OpenTSDB nodes in a 1000-node cluster
If your cluster has 10 or more nodes, at least three OpenTSDB nodes should be available to consume metrics. Typically, for every 10x increase in nodes, you should add another OpenTSDB node. For example, if your cluster reaches a size of 100 nodes, have four OpenTSDB nodes available for consumption.
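The sizing guideline above (three OpenTSDB nodes at 10 cluster nodes, plus one for every 10x increase) can be sketched as a small function. This is only an illustration of the rule of thumb, not a tool shipped with the product. The function name is hypothetical, and because the guideline does not cover clusters with fewer than 10 nodes, the sketch rejects them.

```python
def recommended_opentsdb_nodes(cluster_nodes: int) -> int:
    """Return the suggested number of OpenTSDB nodes for a cluster size.

    Rule of thumb from the text: 3 nodes at 10 cluster nodes, plus one
    more for each further 10x increase (100 -> 4, 1000 -> 5).
    """
    if cluster_nodes < 10:
        # Assumption: the guideline says nothing about smaller clusters.
        raise ValueError("guideline only covers clusters of 10 or more nodes")

    nodes = 3
    threshold = 100  # next cluster size at which another node is added
    while cluster_nodes >= threshold:
        nodes += 1
        threshold *= 10
    return nodes


for size in (10, 100, 1000):
    print(size, recommended_opentsdb_nodes(size))
```

Integer thresholds are used instead of a logarithm so that sizes such as exactly 1000 are not affected by floating-point rounding.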
/opt/mapr/asynchbase/asynchbase-<version>/conf/asynchbase.conf file on the OpenTSDB nodes: "fs.mapr.async.worker.threads=<value>"
Increasing the Number of Streams
For release 6.1 and later, the default setting for the number of streams is one. Even if your cluster grows to 1000 nodes or more, you do not need to increase the number of streams. For release 6.0.x, increasing the number of streams is recommended as you add more nodes (see the release 6.0 documentation), but this practice is not required in release 6.1 and later.
Changing the Automatic Stream Cursor Commits
You can adjust the frequency of automatic stream cursor commits for OpenTSDB by modifying the tsd.streams.autocommit.interval property in opentsdb.conf. The unit is milliseconds. The default value is 60000, which is 60 seconds. For a heavily loaded system, consider increasing the value to something like 5 minutes (300000).
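Because a default of 60000 corresponds to 60 seconds, the interval is expressed in milliseconds. The sketch below converts a duration in minutes to that unit and formats a line for opentsdb.conf. The helper names are hypothetical, and the "key = value" layout is assumed from the usual opentsdb.conf style.

```python
def autocommit_interval_ms(minutes: float) -> int:
    """Convert minutes to milliseconds for the autocommit interval."""
    return round(minutes * 60 * 1000)


def autocommit_conf_line(minutes: float) -> str:
    """Format the property as a line for opentsdb.conf (assumed layout)."""
    return f"tsd.streams.autocommit.interval = {autocommit_interval_ms(minutes)}"


print(autocommit_conf_line(1))  # the default, 60 seconds
print(autocommit_conf_line(5))  # a longer interval for heavy load
```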