Saving Cursor Position

The HPE Ezmeral Data Fabric Streams server uses cursors to keep track of the messages that consumers in consumer groups have read.

There is one cursor per partition per consumer group. There are two kinds of cursors: read cursors and committed cursors.

Figure 1. A topic partition and the cursors of a consumer group

A consumer's read cursor is the offset of the most recent message that HPE Ezmeral Data Fabric Streams has sent to a consumer from a partition.

Consumers that are part of a consumer group can save the current position of their read cursor. Consumers can do this either automatically or manually. The saved cursor is called a committed cursor because it indicates that the consumer has processed all messages in a partition up to and including the one with this offset.

There are two benefits to committing cursors:
Failover on consumer failure
One benefit is that if a consumer fails and HPE Ezmeral Data Fabric Streams reassigns the consumer’s partitions to other consumers in a group, those consumers can start reading from the next offset after the committed cursor in each of those partitions.
Failover on cluster failure
When you back up a stream by replicating it to another cluster, committed cursors are also replicated. If the main cluster fails, consumers that are redirected to the standby copy of the stream can start reading from the next offset after the committed cursor in each partition.
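
The relationship between the two kinds of cursors, and the way a replacement consumer resumes after a failure, can be modeled in a few lines of Java. This is only an illustrative sketch: the class PartitionCursors and its methods are hypothetical and are not part of the HPE Ezmeral Data Fabric Streams API. It also assumes that offsets are consecutive integers starting at 0.

```java
/** Hypothetical model of one partition's cursors for one consumer group. */
public class PartitionCursors {
    // Offset of the most recent message sent to the consumer; -1 means none yet.
    private long readCursor = -1;
    // Offset of the last message recorded as processed; -1 means none yet.
    private long committedCursor = -1;

    /** Delivers the next message and advances the read cursor. */
    public long deliverNext() {
        readCursor++;
        return readCursor;
    }

    /** Saves the current position of the read cursor as the committed cursor. */
    public void commit() {
        committedCursor = readCursor;
    }

    /** Offset at which a replacement consumer starts reading after failover. */
    public long resumeOffset() {
        return committedCursor + 1;
    }

    /** Messages delivered but not committed; these are read again after failover. */
    public long uncommittedCount() {
        return readCursor - committedCursor;
    }

    public static void main(String[] args) {
        PartitionCursors cursors = new PartitionCursors();
        for (int i = 0; i < 5; i++) cursors.deliverNext(); // offsets 0..4 delivered
        cursors.commit();                                   // committed cursor is now 4
        for (int i = 0; i < 3; i++) cursors.deliverNext(); // offsets 5..7 delivered, not committed

        // The consumer fails here. A replacement resumes after the committed cursor.
        System.out.println("resume at offset " + cursors.resumeOffset());         // resume at offset 5
        System.out.println("messages read again: " + cursors.uncommittedCount()); // messages read again: 3
    }
}
```

In this model, offsets 5 through 7 were delivered to the failed consumer but never committed, so the consumer that takes over the partition reads them again.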

How often a consumer should commit depends on how much read duplication you are willing to tolerate. The more often a consumer commits, the fewer messages are read a second time after that consumer fails.

The number of messages that are read a second time depends on how long ago the failed consumer last committed and on the rate at which messages are published to its partitions. For example, suppose that the auto-commit interval is five seconds. A consumer saves its committed cursor and then fails three seconds later. During those three seconds, the consumer's read cursor has continued to move through the messages. When its partitions are reassigned to other consumers in the group, those consumers read the three seconds' worth of messages that the failed consumer already read.
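
The arithmetic behind this example can be sketched as follows. The class and method names are hypothetical, and the rate of 200 messages per second is an assumed figure chosen only for illustration. The sketch also assumes that the consumer keeps pace with the publishers, so that its read rate equals the publish rate.

```java
/** Hypothetical helper that estimates read duplication after a consumer failure. */
public class DuplicateReadEstimate {

    /**
     * Estimates how many messages are read a second time.
     *
     * @param millisSinceLastCommit time between the last commit and the failure
     * @param messagesPerSecond     assumed steady rate of messages on the consumer's partitions
     */
    public static long estimate(long millisSinceLastCommit, long messagesPerSecond) {
        return millisSinceLastCommit * messagesPerSecond / 1000;
    }

    public static void main(String[] args) {
        long autoCommitIntervalMs = 5000; // five-second auto-commit interval from the example
        long failureAfterMs = 3000;       // the consumer fails three seconds after committing
        long rate = 200;                  // assumed messages per second

        // Duplication for the failure described in the example.
        System.out.println(estimate(failureAfterMs, rate));       // 600
        // Upper bound: the consumer fails just before the next commit is due.
        System.out.println(estimate(autoCommitIntervalMs, rate)); // 1000
    }
}
```

Under these assumptions, shortening the commit interval lowers the upper bound on duplicated reads proportionally.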

There are two ways of committing cursors:
Automatic commits
The HPE Ezmeral Data Fabric Streams server commits the cursors for a consumer that is in a consumer group based on the value of the enable.auto.commit configuration parameter. Set this parameter to true to enable auto-commit. The default value is true.
The auto.commit.interval.ms configuration parameter determines the frequency of the commits in milliseconds. The default value is 1000.
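
As a minimal sketch, the two parameters can be collected in a java.util.Properties object, which would typically be passed to the consumer when it is created, together with whatever other settings the consumer requires. The class and method names here are hypothetical, and the five-second interval is only an example value.

```java
import java.util.Properties;

/** Hypothetical helper that builds the auto-commit portion of a consumer configuration. */
public class AutoCommitConfig {

    public static Properties autoCommitProperties(boolean enabled, long intervalMs) {
        Properties props = new Properties();
        // Turn automatic commits on or off for a consumer in a consumer group.
        props.setProperty("enable.auto.commit", Boolean.toString(enabled));
        // Frequency of automatic commits, in milliseconds.
        props.setProperty("auto.commit.interval.ms", Long.toString(intervalMs));
        return props;
    }

    public static void main(String[] args) {
        // Example: commit automatically every five seconds.
        Properties props = autoCommitProperties(true, 5000);
        System.out.println(props.getProperty("enable.auto.commit"));      // true
        System.out.println(props.getProperty("auto.commit.interval.ms")); // 5000
    }
}
```

Calling autoCommitProperties(false, 1000) instead would disable automatic commits, which is the usual starting point when the application commits cursors itself.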
Manual commits
The Java API provides a method of committing cursors manually.