Notes on multi-DC replication of Kafka
Based on https://www.slideshare.net/HadoopSummit/building-largescale-stream-infrastructures-across-multiple-data-centers-with-apache-kafka
Put replicas directly across DCs
- If a DC fails, consumers just switch to a replica in a surviving DC.
- When the DC recovers, Kafka's existing replica catch-up mechanism kicks in.
- If the replicas span regions, latency is high and replication demands a lot of cross-DC bandwidth, which is hard on Kafka.
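
A toy sketch of what "replicas across DCs" means for placement: every partition gets replicas in more than one DC, so losing one DC still leaves a copy. This is not Kafka's actual assignment algorithm (Kafka's rack awareness is driven by the `broker.rack` broker setting); the function name, broker ids and DC names are made up for illustration.

```python
def assign_replicas(brokers_by_dc, num_partitions, replication_factor):
    """Spread each partition's replicas round-robin over the DCs.

    brokers_by_dc: dict of DC name -> list of broker ids (hypothetical ids).
    Returns a dict of partition -> list of broker ids.
    """
    dcs = sorted(brokers_by_dc)
    assignment = {}
    for p in range(num_partitions):
        replicas = []
        for r in range(replication_factor):
            # Rotate the starting DC per partition so load is spread evenly.
            dc = dcs[(p + r) % len(dcs)]
            brokers = brokers_by_dc[dc]
            # Move to the next broker in a DC each time we wrap around the DC list.
            replicas.append(brokers[(p + r // len(dcs)) % len(brokers)])
        assignment[p] = replicas
    return assignment


# Hypothetical layout: two DCs with two brokers each.
layout = {"dc1": [1, 2], "dc2": [3, 4]}
plan = assign_replicas(layout, num_partitions=4, replication_factor=2)
```

With a replication factor of at least the number of DCs, every partition in `plan` has one replica per DC, which is also why every write pays the cross-DC round trip.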
Active-Passive
- Consumers can run in either the active or the passive DC.
- Upon failure, the passive DC becomes the new active; consumers may need to switch too.
- Note that offsets may not match between the two clusters, because of the producer's at-least-once semantics.
- Real-time consumers normally just start from the end of the log and accept some data loss.
- When the failed DC comes back, changes have to be mirrored back with MirrorMaker (MM) from the new active to the former active, which brings back the hard-to-manage cross-DC offset problem.
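
A small simulation of why offsets drift between the two clusters. It does not use Kafka; logs are plain lists of `(timestamp, value)` pairs, and `mirror` and `offset_for_time` are hypothetical helpers. It assumes timestamps are non-decreasing within a log. One retried write on the mirroring side is enough to shift every later offset, so a committed offset from the active cluster does not point at the same record on the passive one. Looking a position up by timestamp is one possible workaround (the Java consumer has `offsetsForTimes` for this kind of lookup).

```python
from bisect import bisect_left


def mirror(source, duplicate_at=()):
    """Copy a log with at-least-once delivery.

    Records whose source index is in duplicate_at are written twice,
    as happens when a write is retried after a lost acknowledgement.
    """
    target = []
    for i, record in enumerate(source):
        target.append(record)
        if i in duplicate_at:
            target.append(record)
    return target


def offset_for_time(log, ts):
    """Earliest offset whose timestamp is >= ts, or len(log) if there is none."""
    timestamps = [t for t, _ in log]
    return bisect_left(timestamps, ts)


active = [(100, "a"), (110, "b"), (120, "c"), (130, "d")]
passive = mirror(active, duplicate_at={1})  # "b" gets written twice

committed = 3                    # consumer on active has processed a, b, c
same_offset = passive[committed]  # on passive this is "c" again, not "d"
by_time = offset_for_time(passive, 130)  # position of the first record at ts 130
from_end = len(passive)          # what a real-time consumer would use instead
```

Resuming by timestamp still gives at-least-once behaviour at best, since several records can share a timestamp.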
Active-Active
- Each DC has an active (local) cluster and an aggregate cluster.
- On failure and recovery, there is no need to reconfigure MM.
- To avoid the aggregate cluster, we can prefix topics with a DC tag and configure MM to mirror remote topics only. Consumers then need to subscribe to the topics with both DC tags.
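
A sketch of the DC-tag naming scheme from the last point, using only the standard `re` module. The tags `dc1`/`dc2`, the `<tag>.<name>` format and the helper names are assumptions for illustration; the resulting regexes are the kind of thing one would hand to MM's topic filter and to a pattern-based consumer subscription.

```python
import re

DCS = ["dc1", "dc2"]  # hypothetical DC tags


def local_topic(dc, name):
    """Topic that producers in `dc` write to, e.g. dc1.orders."""
    return f"{dc}.{name}"


def mirror_pattern(local_dc, dcs=DCS):
    """Regex for the topics MM running in `local_dc` should pull in.

    Only topics tagged with a remote DC match. Local topics and
    already-mirrored copies never bounce back, so there is no mirror loop.
    """
    remote = [re.escape(dc) for dc in dcs if dc != local_dc]
    return r"^(?:" + "|".join(remote) + r")\..+$"


def consumer_pattern(name, dcs=DCS):
    """Regex a consumer subscribes with to see `name` from every DC."""
    tags = "|".join(re.escape(dc) for dc in dcs)
    return r"^(?:" + tags + r")\." + re.escape(name) + r"$"


mm_in_dc1 = re.compile(mirror_pattern("dc1"))
orders_everywhere = re.compile(consumer_pattern("orders"))
```

Here `mm_in_dc1` matches `dc2.orders` but not `dc1.orders`, while `orders_everywhere` matches both, so a consumer in either DC sees the merged stream without an aggregate cluster.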
How to make DB data available in all DCs
- Active-active: the same consumer runs concurrently in both DCs
- Active-passive: only one consumer per DC at any given time
Ideally, the DB replication policy should be the same as the Kafka cluster’s.