Kafka Replication

By Priya Pedamkar

Introduction to Kafka Replication

Replication is at the centre of Kafka’s architecture. The first line of Kafka’s official documentation sums it up as “a distributed, partitioned, replicated commit log service.” Replication matters because it is how Kafka ensures availability and reliability when individual nodes inevitably fail. Replication means keeping multiple copies of the data so that it remains available if one of the brokers can no longer serve requests. In Kafka, replication occurs at partition granularity, i.e. copies of each partition’s write-ahead log are stored on several brokers.

What is Replication?

As we know, data in Kafka is organized by topics. Every topic is partitioned, and every partition can have several replicas. These replicas are stored on brokers, and each broker typically stores hundreds of replicas belonging to various topics and partitions. Every partition has a write-ahead log in which all the messages for that partition are stored. Each message has an offset, which acts as a unique identifier within the partition.

The replication factor determines how many copies of each partition are kept. There are two types of replicas:

Leader replica: Each partition has a single replica designated as the leader. All produce and consume requests go through the leader to ensure consistency.

Follower replica: All replicas of a partition that are not the leader are called followers. By default, followers do not serve client requests; their only task is to replicate the leader’s messages and stay up to date with the most recent ones. If the leader replica of a partition crashes, one of the follower replicas is promoted to become the new partition leader.

How does replication work in Kafka?

Now that we have seen what replication is in Kafka, let us look at how it takes place.

In a Kafka topic, every partition has a write-ahead log that stores the messages. Each message is identified by its offset, which specifies its position in the partition log.

Partition’s write-ahead log
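
The idea of an append-only log with offsets can be sketched in a few lines of Python. This is a toy in-memory model for illustration only; the class name PartitionLog and its methods are hypothetical and are not part of Kafka or of any Kafka client library.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy append-only log for one partition (not Kafka's storage format)."""
    messages: list = field(default_factory=list)

    def append(self, message: str) -> int:
        """Append a message and return the offset assigned to it."""
        offset = len(self.messages)  # next free position in the log
        self.messages.append(message)
        return offset

    def read(self, offset: int) -> str:
        """Return the message stored at the given offset."""
        return self.messages[offset]


log = PartitionLog()
first = log.append("order-created")
second = log.append("order-paid")
```

Because messages are only ever appended, an offset, once assigned, always points to the same message in this sketch.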

Each partition of a Kafka topic is replicated n times, where n is the topic’s replication factor, i.e. the number of copies kept of each partition. If a server in the cluster fails, replication allows Kafka to keep working. Of the n replicas, one is designated as the leader. As the name suggests, the leader accepts writes from producers, and the followers simply copy the leader’s log in order.
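
To make the role of the replication factor concrete, here is a simplified sketch that spreads the replicas of each partition across brokers in round-robin fashion and treats the first replica as the leader. The function assign_replicas is hypothetical, and the placement rule is an assumption made for illustration; Kafka’s real assignment logic takes more factors into account.

```python
def assign_replicas(num_partitions, broker_ids, replication_factor):
    """Return {partition: [broker ids]}; the first broker in each list leads."""
    if replication_factor > len(broker_ids):
        # Each replica of a partition must live on a different broker.
        raise ValueError("replication factor exceeds number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        assignment[partition] = [
            broker_ids[(partition + i) % len(broker_ids)]
            for i in range(replication_factor)
        ]
    return assignment


# Three partitions on three brokers with a replication factor of 2.
assignment = assign_replicas(3, [1, 2, 3], 2)
leaders = {partition: replicas[0] for partition, replicas in assignment.items()}
```

With these inputs every partition ends up with two copies on two different brokers, and each broker leads one partition.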

Kafka Cluster with replication factor 2

In the figure above, the replication factor is 2, which means that there are two copies of each partition.

So let’s see what happens when a broker goes down. Let’s say broker 3 goes down for some reason. Access to partition 3 is now lost, because broker 3 was the leader for partition 3. Kafka then automatically picks one of the in-sync replicas (in this case, there is only one) and makes it the leader. When broker 3 comes back online, it can catch up with the new leader and later become the leader again. The leader maintains the in-sync replica (ISR) list for its partition by tracking how far each replica lags behind it. When a producer sends a message to the broker, the leader appends it to its log, and the followers then copy it from the leader. A message is considered committed only after it has been successfully copied to all in-sync replicas. Using the “acks” setting, producers can choose which acknowledgements to receive for their writes to the partition.
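
The failover and commit rules described above can be sketched with a small in-memory model. PartitionState and its fields are hypothetical names chosen for this example, and the sketch assumes the simplest possible election rule (take the first remaining in-sync replica); it does not reproduce how Kafka’s controller actually works.

```python
from dataclasses import dataclass


@dataclass
class PartitionState:
    leader: int    # broker id of the current leader
    isr: list      # broker ids currently in sync, leader included
    log_end: dict  # broker id -> number of messages that replica holds

    def committed_count(self) -> int:
        """Messages held by every in-sync replica count as committed."""
        return min(self.log_end[broker] for broker in self.isr)

    def fail_broker(self, broker: int) -> None:
        """Drop a failed broker from the ISR and elect a new leader if needed."""
        self.isr = [b for b in self.isr if b != broker]
        if broker == self.leader:
            if not self.isr:
                raise RuntimeError("no in-sync replica left to lead")
            self.leader = self.isr[0]


# Partition 3: broker 3 leads with 5 messages, broker 1 has copied 4 so far.
state = PartitionState(leader=3, isr=[3, 1], log_end={3: 5, 1: 4})
committed_before = state.committed_count()
state.fail_broker(3)
committed_after = state.committed_count()
```

In this model the fifth message was never committed, because only the failed leader held it, while the four committed messages are still available from the new leader.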

Let’s see what acks mean in Apache Kafka and how to set them.

An acknowledgement (ACK) is a signal sent back to the sender to confirm that a transmitted message was received successfully. There are three acknowledgement settings that users can choose from, depending on their Kafka use case.

  • acks=0: When acks is set to 0, the producer does not wait for an acknowledgement from the broker, so there is no assurance that the broker received the message successfully. The producer will not try to send the message again, because it never learns that the record was lost, so there is a chance of data loss.
  • acks=1: When acks is set to 1, the producer gets an acknowledgement once the record has been received by the leader. The leader replies without waiting for acknowledgement from all followers. A message is lost only if the leader fails right after acknowledging the record but before the followers have replicated it, so limited data loss is possible.
  • acks=all: When acks is set to all, the producer gets an acknowledgement only once all in-sync replicas have the data. The leader must wait for the full set of in-sync replicas to receive the record. Sending a message with acks=all therefore takes longer, but it provides the strongest durability guarantee: an acknowledged message is not lost as long as at least one in-sync replica remains alive.
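
The three settings can be compared with a small decision function. This is a sketch of the rules listed above, not code from a Kafka client; the function name send_considered_complete and its parameters are hypothetical. Only the setting name acks and its values 0, 1 and all come from Kafka’s producer configuration.

```python
def send_considered_complete(acks, leader_has_record, isr_followers_have_record):
    """Return True if the producer would treat the send as done.

    acks: "0", "1" or "all", as in the producer's acks setting.
    leader_has_record: whether the leader has written the record.
    isr_followers_have_record: one boolean per in-sync follower.
    """
    if acks == "0":
        # The producer does not wait for the broker at all.
        return True
    if acks == "1":
        # Only the leader's write matters.
        return leader_has_record
    if acks == "all":
        # The leader and every in-sync follower must have the record.
        return leader_has_record and all(isr_followers_have_record)
    raise ValueError(f"unsupported acks value: {acks!r}")


# The leader has the record, but one of two in-sync followers is still behind.
results = {
    acks: send_considered_complete(acks, True, [True, False])
    for acks in ("0", "1", "all")
}
```

In this scenario the send is already complete under acks=0 and acks=1, but under acks=all the producer keeps waiting until the second follower has caught up.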

Importance of using replication in Kafka

Kafka clients will get the following benefits with replication.

  • A producer can continue to publish messages during a failure and can choose between latency and durability depending on the use case.
  • A consumer continues to receive the correct messages in real time, even in the event of a failure.

Conclusion

So far, we’ve seen what Kafka replication is, how it works, and why it’s important. Kafka ensures that once a message is acknowledged as committed, it will not be lost even if the leader fails.
