Primary-Secondary Replication
In this topology, you replicate one way from source tables to replicas. The replicas can be in a remote cluster or in the cluster where the source tables are located.
Several topologies are possible for primary-secondary replication:
Replication from one source table to one or more replica tables
In this topology, updates on a source table are replicated to one or more replicas, but updates to the replicas are not replicated back to the source table.
For example, in this diagram, updates to the customers table in the
cluster sanfrancisco are being replicated to the newyork
and hyderabad clusters. The circles marked G each represent a
HPE Data Fabric gateway.
However, changes to the table in the newyork and
hyderabad clusters are not replicated back to the table in the
sanfrancisco cluster.
You can also replicate within a single cluster. In this example, the cluster
sanfrancisco contains both the source table and the replica.
Many-to-one replication
Multiple source tables can replicate to a single replica. In this diagram, operations on
customers tables in three different clusters are replicated via
gateways to the customers table in the newyork
cluster.
One-to-many replication
A single source table can replicate to multiple replicas. In this diagram, operations on
the customers table in the sanfrancisco cluster are
replicated via gateways to replicas in three other clusters.
Replication loops
When three or more tables need to be kept in sync, you can set up primary-secondary replication between pairs of them to form a replication loop. Operations on a table are propagated to the other clusters in the loop, but there is no attempt to reapply the operations at the originating table. This is because the operations are tagged with a universally unique identifier (UUID) that identifies the table where the operations originated.
In this diagram, for example, operations on the customers table in the
hyderabad cluster are replicated first to the
customers table in the tokyo cluster. The operations
are then replicated from the tokyo cluster to the
customers table in the sanfrancisco cluster. Finally,
the operations are replicated from the sanfrancisco cluster to the
customers table in the newyork cluster. The
newyork cluster does not replicate the operations to the
customers table in the hyderabad cluster.
Primary-Secondary replication in two directions
You can combine primary-secondary replication configurations to replicate data between clusters. Two clusters engaged in replication can each act as a source cluster and a destination cluster.
In this example, the data in the customers table in the cluster
sanfrancisco is replicated to the customers table in
the cluster newyork. At the same time, the data in the
products table in the newyork cluster is replicated to
the products table in the cluster sanfrancisco.
In all primary-secondary configurations, changes made to replica tables are not replicated back to source tables. Therefore, if the replicated data is modified at the replica by client applications, the replica will become out of sync with the source table.
For example, you might replicate the two column families personal and
purchases from the customer table in the
sanfrancisco cluster to the customers table in the
newyork cluster, as in this diagram. (For simplicity, the blue circle
labeled G represents two or more gateways, rather than one as in the other diagrams in this
topic.)
In primary-secondary replication, no updates to a replica are replicated back to the
source. Any updates that applications might make to those two column families in the
customers table in the newyork cluster will not be
replicated to the customers table in the sanfrancisco
cluster.
However, you do not have to protect a replica from all updates that are not due to
replication. For example, the customers table in the
newyork cluster might have an additional column family that is not
populated with replicated data: reviews.