Setting Up Primary-Secondary Stream Replication Manually
Describes how to setup a primary-secondary stream replica that replicates in one direction.
About this task
Replica streams can be in a remote data-fabric cluster or in the data-fabric cluster where their source
streams are located. All updates from a source stream arrive at a replica stream
after having been authenticated at a gateway. Therefore, the
produceperm
access control expressions on the replica stream is
irrelevant; gateways have the implicit authority to publish messages to topics in
replica streams.
To set up primary-secondary replication of streams:
Procedure
-
Create the replica manually with the
maprcli stream create
command. Use the-copymetafrom
option to ensure that the metadata for the replica is identical to the metadata for the source stream.maprcli stream create -path <path to replica> -copymetafrom <path to source stream>
For example, to create the replica
activity
in thenewyork
cluster and use the metadata from the source stream in thesanfrancisco
cluster, you could use this command:maprcli stream create -path /mapr/newyork/activity -copymetafrom /mapr/sanfrancisco/activity
-
Register the replica as a replica of the source stream by running the
maprcli stream replica add
command.maprcli stream replica add -path <path to source stream> -replica <path to replica> -paused true
For example, to register the
activity
stream in thenewyork
cluster as a replica of theactivity
stream in thesanfrancisco
cluster, you could use this command:maprcli stream replica add -path /mapr/sanfrancisco/activity -replica /mapr/newyork/activity -paused true
The
-paused
parameter ensures that replication does not start immediately after you register the source stream as a source for this replica. You do this registration in step 4. -
Verify that you specified the correct replica by running the
maprcli stream replica list
command.maprcli stream replica list -path <path to source stream>
To verify that the
activity
stream in thenewyork
cluster is a replica of theactivity
stream in thesanfrancisco
cluster, you could look at the output of this command:maprcli stream replica list -path /mapr/sanfrancisco/activity
-
Authorize replication between the streams by defining the source stream as the
upstream stream for the replica by running the
maprcli stream upstream add
command.Definition of the upstream stream ensures that a stream cannot replicate updates to any replica. Replication depends on a two-way agreement between the owners of the two streams.maprcli stream upstream add -path <path to replica> -upstream <path to source stream>
To add the
activity
stream in thesanfrancisco
cluster as an upstream source for theactivity
stream in thenewyork
cluster:maprcli stream upstream add -path /mapr/newyork/activity -upstream /mapr/sanfrancisco/activity
-
Verify that you specified the correct source stream by running the
maprcli stream upstream list
command.maprcli stream upstream list -path <path to the replica>
To verify this in our example scenario, you could use this command:
maprcli stream upstream list -path /mapr/newyork/activity
-
Load the replica with data from the source stream by using the
mapr copystream
utility. -
Start replication with the command
maprcli stream replica resume
.maprcli stream replica resume -path <path to the source stream> -replica <path to the replica>
For our example scenario, you could use this command:
maprcli stream replica resume -path mapr/sanfrancisco/activity -replica /mapr/newyork/activity