Gateways and Stream Replication
When replicating streams, HPE Ezmeral Data Fabric Streams replicates messages that are published to a source stream. Gateways are services that receive messages from source streams and publish them to replica streams.
You configure gateways on nodes that are in destination clusters. On source clusters, you list the destination clusters and the gateways that are running on them.
During replication, HPE Ezmeral Data Fabric Streams sends messages from source streams to the gateways on the destination clusters, where the replicas of those source streams are located. Gateways batch the messages and then apply them to replicas. All messages from a source stream arrive at a replica after having been authenticated at a gateway. Therefore, access control expressions on the replica that control permission to publish messages are irrelevant; gateways have the implicit authority to publish messages to replicas.
HPE Ezmeral Data Fabric Streams distributes messages to a destination cluster’s gateways in round-robin fashion. If a gateway is down or unreachable, HPE Ezmeral Data Fabric Streams chooses another gateway. If all of the gateways are down, HPE Ezmeral Data Fabric Streams retries the operation periodically until a gateway comes online.
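The selection behavior described above can be sketched as follows. This is only an illustrative model, not the product's implementation: the `GatewaySelector` class, the `is_up` health check, and the gateway names are hypothetical and are not part of any Data Fabric API.

```python
from itertools import cycle
from typing import Callable, Iterable, Optional


class GatewaySelector:
    """Hypothetical round-robin selector over a destination cluster's gateways."""

    def __init__(self, gateways: Iterable[str], is_up: Callable[[str], bool]):
        self._gateways = list(gateways)
        self._is_up = is_up                # assumed health check supplied by the caller
        self._ring = cycle(self._gateways)

    def next_gateway(self) -> Optional[str]:
        # Try each gateway at most once per call, continuing from where
        # the previous call stopped so that load rotates across gateways.
        for _ in range(len(self._gateways)):
            candidate = next(self._ring)
            if self._is_up(candidate):
                return candidate
        # Every gateway is down or unreachable; the caller retries later.
        return None


down = {"gw2"}
selector = GatewaySelector(["gw1", "gw2", "gw3"], lambda g: g not in down)
picks = [selector.next_gateway() for _ in range(4)]
# picks == ["gw1", "gw3", "gw1", "gw3"]: gw2 is skipped while it is down
```

When `next_gateway` returns `None`, a caller modeling the retry behavior would wait and call it again; the retry interval is left out because the text does not state one.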
You must configure gateways in destination clusters. If the destination cluster is remote from the cluster in which a source stream is located, then the gateways must be in the remote cluster. If the destination cluster is the source cluster, meaning that a source stream and its replica are located in a single cluster, then the gateways must be in the local cluster.
In a Primary-Secondary setup, you cannot have two primary instances with the same topic name replicating to the same secondary instance, because doing so creates a conflict for that topic name. This is similar to Multi-Master replication, where you must have separate topic names for Master1 (Cluster1) and Master2 (Cluster2).
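As a rough illustration of the topic-name rule, the following sketch checks a planned set of replication links for conflicts before they are configured. The function name, the tuple layout, and the stream labels are assumptions made for this example; they do not correspond to a Data Fabric tool or API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Each link is (primary_stream, secondary_stream, topic_name); labels are hypothetical.
Link = Tuple[str, str, str]


def find_topic_conflicts(links: Iterable[Link]) -> Dict[Tuple[str, str], List[str]]:
    """Return the (secondary, topic) pairs that are fed by more than one primary."""
    primaries = defaultdict(set)
    for primary, secondary, topic in links:
        primaries[(secondary, topic)].add(primary)
    return {key: sorted(srcs) for key, srcs in primaries.items() if len(srcs) > 1}


planned = [
    ("sanfrancisco:activity", "newyork:activity", "clicks"),
    ("losangeles:activity", "newyork:activity", "clicks"),   # same topic, same secondary
    ("losangeles:activity", "newyork:activity", "orders"),
]
conflicts = find_topic_conflicts(planned)
# conflicts == {("newyork:activity", "clicks"):
#               ["losangeles:activity", "sanfrancisco:activity"]}
```

Renaming the topic on one of the primaries removes the conflict, which mirrors the Multi-Master requirement of separate topic names per master.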
For more information about replicating streams, see Stream Replication.
Gateways on nodes in remote destination Data Fabric clusters
In this type of topology, gateways receive messages that are published to source streams, authenticate with the destination cluster on behalf of the source cluster, and publish the messages to the corresponding streams.
This diagram of basic intercluster primary-secondary replication shows messages from the activity stream in the cluster sanfrancisco being sent to gateways. The gateways then publish the messages to the replica stream that is in the cluster newyork.
The gateways on a destination cluster are not assigned to particular replicas. They publish messages to all replicas on the destination cluster. For example, in the following diagram, messages from two source streams in the cluster sanfrancisco are replicated to two replicas in the cluster newyork. There are four gateways. Each gateway receives messages from both source streams, and each gateway applies those messages to the corresponding replicas.
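The idea that a gateway serves every replica on its cluster, batching messages before applying them, can be modeled with a short sketch. Everything here is illustrative: the `Gateway` class, the stream labels (including the second stream, `orders`), and the batch size of 2 are assumptions, not product behavior or configuration values.

```python
from collections import defaultdict
from typing import Dict, List


class Gateway:
    """Hypothetical model of a gateway that is not bound to any single replica."""

    def __init__(self, name: str, replica_for: Dict[str, str], batch_size: int = 2):
        self.name = name
        self._replica_for = replica_for    # source stream -> its replica stream
        self._batch_size = batch_size      # assumed threshold, for illustration only
        self._pending: Dict[str, List[str]] = defaultdict(list)
        self.applied: Dict[str, List[str]] = defaultdict(list)

    def receive(self, source_stream: str, message: str) -> None:
        # Route by source stream, so one gateway can feed any replica.
        replica = self._replica_for[source_stream]
        self._pending[replica].append(message)
        if len(self._pending[replica]) >= self._batch_size:
            self.flush(replica)

    def flush(self, replica: str) -> None:
        # Apply the whole pending batch to the replica in arrival order.
        self.applied[replica].extend(self._pending.pop(replica, []))


mapping = {
    "sanfrancisco:activity": "newyork:activity",
    "sanfrancisco:orders": "newyork:orders",
}
gw = Gateway("gw1", mapping)
gw.receive("sanfrancisco:activity", "a1")
gw.receive("sanfrancisco:orders", "o1")
gw.receive("sanfrancisco:activity", "a2")   # second activity message fills the batch
```

After these calls, `gw.applied["newyork:activity"]` holds `["a1", "a2"]`, while `o1` is still pending until a second message arrives for that replica or `flush` is called explicitly.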
Gateways on nodes within a Data Fabric cluster serving as source and destination
In this type of topology, gateways also receive messages that are published to source streams and publish the messages to the replicas. However, all of this activity takes place within a single Data Fabric cluster.
The following schematic diagram of basic intracluster primary-secondary replication shows messages from the activity1 stream in the cluster sanfrancisco being sent to gateways. The gateways then publish the messages to the stream activity2.