Published May 2023
Apache Pulsar offers geo-replication as out-of-the-box functionality. This sets Pulsar apart from some other message queues that require external tools for geo-replication.
Pulsar Geo-Replication
Geo-replication is the replication of messages across multiple clusters of a Pulsar instance.
For clarity, a Pulsar instance here means one or more Pulsar clusters, each made up of Pulsar brokers, BookKeeper bookies, and ZooKeeper nodes.
Other distributed messaging systems, such as Apache Kafka, support geo-replication but only with the assistance of an external tool such as MirrorMaker.
Geo-replication is a built-in feature of Pulsar, made possible in part because BookKeeper is used as the storage layer.
Both synchronous and asynchronous geo-replication are available. Synchronous geo-replication occurs at the BookKeeper level, and asynchronous geo-replication is configured at the Pulsar broker level.
For asynchronous geo-replication, two configurations are required. Firstly, geo-replication needs to be enabled on the namespaces involved.
If you aren’t familiar with namespaces, let’s briefly review. Namespaces are used as a grouping mechanism for related topics. There is no limit to the number of topics included in a namespace.
Secondly, the namespace must be configured to replicate across two (or more) clusters. With these two configurations in place, all messages published to any topic in the namespace are automatically replicated to every cluster in the configured set.
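As a rough illustration of the second step, the sketch below builds the admin request that assigns a set of replication clusters to a namespace. It assumes the Pulsar admin REST API exposes this at POST /admin/v2/namespaces/{tenant}/{namespace}/replication with a JSON list of cluster names as the body; check that path against the documentation for your Pulsar version. The admin URL, tenant, namespace, and cluster names are hypothetical placeholders, and the helper only constructs the request without sending it.

```python
import json
import urllib.request


def build_replication_request(admin_url, tenant, namespace, clusters):
    """Build (but do not send) a request that sets a namespace's replication clusters.

    Assumption: the endpoint path below matches the Pulsar admin REST API (v2).
    """
    url = f"{admin_url.rstrip('/')}/admin/v2/namespaces/{tenant}/{namespace}/replication"
    # The request body is a JSON array of cluster names.
    body = json.dumps(list(clusters)).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical values: replace with your own admin URL, tenant, namespace, and clusters.
    req = build_replication_request(
        "http://localhost:8080", "my-tenant", "my-namespace", ["us-west", "us-east"]
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To apply the change against a live cluster: urllib.request.urlopen(req)
```

The same setting can also be applied from the command line with pulsar-admin namespaces set-clusters, passing the namespace and a comma-separated --clusters list. In either case, the tenant that owns the namespace generally needs to be allowed to use each of the listed clusters.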
For additional information on Pulsar geo-replication, visit the Pulsar documentation.
Apache Pulsar Support Services
If you are interested in 24/7 support, consulting, and/or fully managed Pulsar services, you can find more information on our Apache Pulsar services page.