EDUCBA

Kafka MirrorMaker

By Priya Pedamkar

Introduction to Kafka MirrorMaker

Apache Kafka ships with a simple tool for replicating data between two data centres. It is called MirrorMaker, and at its core it is a set of consumers (called streams) that all belong to the same consumer group and read data from the topics you have chosen to replicate. It is little more than a Kafka consumer and a Kafka producer tied together: messages are read from topics in the source cluster and written to a topic with the same name in the destination cluster. Let us explore Kafka MirrorMaker further by looking at its architecture.

Architecture of Kafka MirrorMaker

  • The process of copying data between Kafka clusters is called “mirroring”. Mirroring is commonly used to keep a separate copy of a Kafka cluster in another data centre. Kafka’s MirrorMaker tool reads data from topics in one or more source Kafka clusters and writes it to the corresponding topics in a destination Kafka cluster (using the same topic names).
  • The source and target clusters are entirely independent: they can have different numbers of partitions, and the offsets of a mirrored message will generally differ between them.
  • The figure below shows an example MirrorMaker architecture, in which messages from two different clusters are aggregated into an aggregate cluster, and that aggregate cluster is then copied to another data centre.

Kafka MirrorMaker Architecture
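
The aggregation pattern described above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions, not MirrorMaker itself and not a real Kafka client: the `Cluster` class, the `mirror` function, the simplified partitioner, and the cluster and topic names are all hypothetical. It shows two source clusters with different partition counts being mirrored into one aggregate cluster under the same topic name, with partitions and offsets assigned independently on the target.

```python
from collections import defaultdict


class Cluster:
    """Toy in-memory stand-in for a Kafka cluster (hypothetical, not a real client)."""

    def __init__(self, name, partitions_per_topic):
        self.name = name
        # topic -> list of partitions; each partition is an append-only list
        self.topics = defaultdict(
            lambda: [[] for _ in range(partitions_per_topic)]
        )

    def produce(self, topic, key, value):
        """Append a record and return the (partition, offset) it landed on."""
        partitions = self.topics[topic]
        # Simplified deterministic partitioner, chosen only for this sketch.
        partition = sum(key.encode()) % len(partitions)
        partitions[partition].append((key, value))
        return partition, len(partitions[partition]) - 1

    def records(self, topic):
        """Yield (partition, offset, key, value) for every record in a topic."""
        for partition, log in enumerate(self.topics[topic]):
            for offset, (key, value) in enumerate(log):
                yield partition, offset, key, value


def mirror(source, target, topic):
    """Copy every record of `topic` to the same topic name on the target.

    Returns pairs of (source placement, target placement) so the two can be compared.
    """
    placements = []
    for partition, offset, key, value in source.records(topic):
        new_placement = target.produce(topic, key, value)
        placements.append(((source.name, partition, offset), new_placement))
    return placements


# Two local clusters with different partition counts, plus an aggregate cluster.
dc_a = Cluster("dc-a", partitions_per_topic=2)
dc_b = Cluster("dc-b", partitions_per_topic=4)
aggregate = Cluster("aggregate", partitions_per_topic=3)

dc_a.produce("clicks", "user-1", "home")
dc_a.produce("clicks", "user-2", "cart")
dc_b.produce("clicks", "user-1", "search")

moves = mirror(dc_a, aggregate, "clicks") + mirror(dc_b, aggregate, "clicks")
mirrored_values = sorted(v for _, _, _, v in aggregate.records("clicks"))

for source_placement, target_placement in moves:
    print(source_placement, "->", target_placement)
```

Printing `moves` makes the independence visible: the partition and offset a record had on its source cluster need not match the ones it receives on the aggregate cluster.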

  • When operating multiple data centres, it is almost always necessary to copy messages between them, so that online applications can access user activity in both data centres. For instance, if a user changes the personal information in their account, that change needs to be visible regardless of which data centre serves the search results.
  • Kafka’s built-in replication works only within a single cluster; it does not replicate between different clusters. MirrorMaker fills that gap, and the process is simple: every MirrorMaker process has one producer, and MirrorMaker runs one thread for each consumer.
  • Each consumer reads events from the topics and partitions assigned to it on the source cluster and uses the shared producer to send those events to the target cluster.
  • Every 60 seconds (by default), the consumers tell the producer to send all the events it holds to Kafka and wait until Kafka acknowledges them. The consumers then contact the source Kafka cluster to commit the offsets for those events. This guarantees no data loss (messages are acknowledged by Kafka before offsets are committed to the source). If the MirrorMaker process crashes, there will be no more than 60 seconds’ worth of duplicates.
  • In essence, MirrorMaker is a consumer and a producer combined: the consumer connects to the topics you choose in cluster A, reads the data, and the producer writes every message to cluster B. Every message received by a topic in cluster A becomes available in cluster B under the same topic name.
  • Figure 1 shows an example of this architecture, in which MirrorMaker aggregates messages from two different clusters into an aggregate cluster, and that aggregate cluster is then copied to another data centre.
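
The ordering described above (deliver to the target first, commit offsets to the source afterwards) is what turns a crash into duplicates rather than lost messages. The following is a minimal sketch of that idea using plain Python lists in place of Kafka clusters; `SourceLog`, `MirrorProcess`, and the event names are hypothetical and exist only for this illustration. The periodic 60-second timer is replaced by explicit `commit()` calls.

```python
class SourceLog:
    """Toy source partition: a list of events plus a committed offset."""

    def __init__(self, events):
        self.events = list(events)
        self.committed = 0  # offset a restarted mirror process resumes from


class MirrorProcess:
    """Toy mirror: copies events to the target, then commits source offsets."""

    def __init__(self, source, target):
        self.source = source
        self.target = target
        # A new process always starts from the last committed offset.
        self.position = source.committed

    def copy(self, max_events):
        """Read a batch and deliver it; in this model, appending to the
        target list stands in for 'the target cluster acknowledged it'."""
        batch = self.source.events[self.position:self.position + max_events]
        self.target.extend(batch)
        self.position += len(batch)

    def commit(self):
        """Commit offsets only after the events above were delivered."""
        self.source.committed = self.position


source = SourceLog(["e0", "e1", "e2", "e3", "e4"])
target = []

first = MirrorProcess(source, target)
first.copy(2)
first.commit()   # e0 and e1 are delivered and committed
first.copy(2)    # e2 and e3 are delivered, then the process "crashes"
                 # before it can commit their offsets

second = MirrorProcess(source, target)  # restarts at the committed offset
second.copy(10)
second.commit()

duplicates = sorted(e for e in set(target) if target.count(e) > 1)
print("target:", target)
print("duplicates:", duplicates)
```

In this run nothing from the source is missing on the target, and the only repeated events are the ones delivered after the last commit. Committing before delivery would reverse that trade-off: a crash could then skip events instead of repeating them.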

Benefits of Kafka MirrorMaker

Below are the benefits of Kafka MirrorMaker:

1. Global and Central Clusters

In some cases, an organization has one or more data centres in different geographic regions, cities, or continents. Many applications can work just by communicating with their local cluster, but some need data from multiple data centres (otherwise, you would not be looking at approaches for cross-data-centre replication). There are many situations where this is a requirement; the classic example is a business that adjusts prices based on supply and demand. That organization can run a data centre in each area where it has a presence, gather local supply and demand statistics there, and adjust prices accordingly. This information is then replicated to a central cluster, where business analysts can report on sales across the whole company.

2. Redundancy (DR)

Here the applications run on a single Kafka cluster and do not need data from other sites, but you are concerned about the whole cluster becoming unavailable for some reason. In that case you would like a second Kafka cluster that holds all the data in the first one, so that you can direct your applications to it in an emergency.

3. Cloud Migrations

Many companies these days run their business in both an on-site data centre and a cloud provider. Applications often run in multiple regions of a cloud provider for flexibility, and sometimes several cloud providers are used. In these situations there is often at least one Kafka cluster in each on-site data centre and in each cloud region. Applications in each data centre and region use those Kafka clusters to transfer data efficiently between the data centres. For instance, if a new application is deployed in the cloud but needs data that is updated by applications running in the on-site data centre and stored in an on-site database, you can use Kafka Connect to capture the database changes in the local Kafka cluster and then replicate those changes to the Kafka cluster where the new application runs. This helps control the cost of cross-data-centre traffic and improves the governance and security of that traffic.

4. Support Data and Schema Replication

Kafka MirrorMaker supports data replication by streaming data in real time between Kafka clusters and data centres. It can be used alongside Confluent Schema Registry for multi-data-centre data quality and governance, and alongside Kafka Connect to manage data integration across multiple data centres.

5. Ease of Topic Selection

It offers flexible topic selection: the topics to mirror can be chosen with whitelists, blacklists, and regular expressions.
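
As a rough illustration of regex-based topic selection, the sketch below filters a list of topic names with an include pattern and an optional exclude pattern. It uses Python's `re` module rather than MirrorMaker's own matching, and the `select_topics` helper, the topic names, and the patterns are all assumptions made for this example.

```python
import re


def select_topics(topics, include, exclude=None):
    """Return the topics that fully match `include` and do not match `exclude`.

    Both arguments are regular expressions; `exclude` is optional.
    """
    include_re = re.compile(include)
    exclude_re = re.compile(exclude) if exclude else None
    selected = []
    for topic in topics:
        if not include_re.fullmatch(topic):
            continue  # not on the whitelist
        if exclude_re and exclude_re.fullmatch(topic):
            continue  # explicitly blacklisted
        selected.append(topic)
    return selected


# Hypothetical topic names on a source cluster.
topics = ["orders", "orders.eu", "payments", "internal.metrics", "__consumer_offsets"]

mirrored = select_topics(topics, include=r"orders.*|payments", exclude=r".*\.eu")
print(mirrored)
```

A pattern-based whitelist has the practical advantage that topics created later are picked up automatically when their names match, without changing the configuration.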

Conclusion

So far, we have seen what Kafka MirrorMaker is, what its architecture looks like, and how Kafka mirroring works. We have also looked at its use cases and the benefits of using it. In short, MirrorMaker can aggregate messages from two or more local clusters into an aggregate cluster, and that cluster can then be copied to other data centres for redundancy and fault tolerance.

© 2022 - EDUCBA. ALL RIGHTS RESERVED. THE CERTIFICATION NAMES ARE THE TRADEMARKS OF THEIR RESPECTIVE OWNERS.
