Kafka-Technical Introduction
In this session you will learn about:
Topic
•Partitions
•Offsets
•Topic replication factor
Broker
•What a broker is
•How a broker holds topics
Topic
A topic is a particular stream of data
Topics are the basis of everything in Kafka
You can compare a topic to a table in a database
•Just as a table in a database contains data, a topic in Kafka contains data
You can have as many topics as you want
A topic is identified by its name
Topics are split into partitions
•Each partition is ordered
•Each message in a partition gets an incremental id, called an offset
Kafka Topic
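The relationship between topics, partitions, and offsets can be sketched with a toy in-memory model. This is not Kafka's API; the class names `Topic` and `Partition` and the topic name "orders" are made up for illustration. It only shows that each partition is an ordered, append-only log and that offsets are counted separately per partition.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Toy append-only log; the list index plays the role of the offset."""
    records: list = field(default_factory=list)

    def append(self, value):
        self.records.append(value)
        return len(self.records) - 1  # offset assigned to the new record

    def read(self, offset):
        return self.records[offset]


@dataclass
class Topic:
    """Toy topic: a name plus a fixed number of independent partitions."""
    name: str
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        self.partitions = [Partition() for _ in range(self.num_partitions)]


# Hypothetical topic with two partitions.
topic = Topic("orders", num_partitions=2)
first = topic.partitions[0].append("a")   # first record in partition 0
second = topic.partitions[0].append("b")  # next offset in partition 0
other = topic.partitions[1].append("x")   # partition 1 starts counting on its own
```

Here `first` and `other` are both offset 0, yet they point at different records, because an offset is only meaningful together with its partition.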
An offset only has meaning within a specific partition
•E.g. offset 3 in partition 0 doesn't represent the same data as offset 3 in partition 1
Data order is guaranteed only within a single partition
Data is kept only for a limited time (by default one week)
Once data is written to a partition it can't be changed (immutability)
Data is spread across partitions without regard to its content unless a key is provided; messages with the same key always go to the same partition (as long as the number of partitions does not change)
Offset in a partition
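The role of the key in choosing a partition can be illustrated with a small sketch. It is a simplification under stated assumptions: the class `ToyPartitioner` is hypothetical, `zlib.crc32` stands in for the hash function (the Java client's default partitioner uses a murmur2 hash instead), and keyless records are spread round-robin here, whereas real clients may batch them differently.

```python
import itertools
import zlib


class ToyPartitioner:
    """Illustrative partition chooser; not the real Kafka partitioner."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition_for(self, key=None):
        if key is None:
            # No key: spread records over the partitions in turn.
            return next(self._round_robin)
        # With a key: hash it, so equal keys map to the same partition.
        # crc32 is an assumption made for this sketch only.
        return zlib.crc32(key) % self.num_partitions


partitioner = ToyPartitioner(num_partitions=3)
same_partition = (
    partitioner.partition_for(b"user-42") == partitioner.partition_for(b"user-42")
)
```

Because all records with the same key land in one partition, they are also read back in the order they were written, which is why keys matter whenever per-entity ordering is needed.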
A broker is a server where Kafka is installed and running
A Kafka cluster is composed of multiple brokers
Each broker in a Kafka cluster is identified by a unique id (an integer). The ids themselves are arbitrary
Each broker in a Kafka cluster holds only certain topic partitions, not all of the data
After connecting to any broker, you are connected to the entire Kafka cluster
In short, brokers are where topic data is stored
Broker
How a broker holds topics
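As a rough illustration of how the partitions of a topic end up spread across brokers, the sketch below hands out partitions in turn. The function `assign_partitions`, the topic name "topic-a", and the broker ids 101 to 103 are assumptions for this example; Kafka's own assignment logic is more involved.

```python
def assign_partitions(topic, num_partitions, broker_ids):
    """Spread the partitions of one topic over the brokers in turn."""
    layout = {broker: [] for broker in broker_ids}
    for partition in range(num_partitions):
        broker = broker_ids[partition % len(broker_ids)]
        layout[broker].append((topic, partition))
    return layout


# Hypothetical cluster of three brokers and a topic with three partitions.
layout = assign_partitions("topic-a", 3, [101, 102, 103])
```

Each broker ends up holding one partition of the topic, so no single broker stores the whole topic.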
Topic replication factor
The replication factor is the number of copies kept of each partition of a topic
For example, if a topic is created with replication factor 3, each partition is stored
three times: the original plus two additional copies, each on a different broker
This is for fault tolerance
Fault tolerance is the property that enables a system to continue operating properly
when some of its components fail
Topics should have a replication factor > 1 (usually 2 or 3)
This way, if a broker is down, another broker can serve the data
Topic replication factor
E.g. if we lose Broker 102
Result: Broker 101 and Broker 103 can still serve the data
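The failure scenario above can be played through in a small simulation. It assumes a topic with 2 partitions and replication factor 2 on brokers 101, 102, and 103; the helper names `place_replicas` and `surviving_replicas` and the simple "next broker in the list" placement are illustrative choices, not Kafka's actual placement algorithm.

```python
def place_replicas(num_partitions, replication_factor, broker_ids):
    """Put each partition's copies on consecutive, distinct brokers."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    placement = {}
    for partition in range(num_partitions):
        placement[partition] = [
            broker_ids[(partition + copy) % len(broker_ids)]
            for copy in range(replication_factor)
        ]
    return placement


def surviving_replicas(placement, failed_broker):
    """Return, per partition, the brokers that still hold a copy."""
    return {
        partition: [b for b in brokers if b != failed_broker]
        for partition, brokers in placement.items()
    }


placement = place_replicas(2, 2, [101, 102, 103])
after_failure = surviving_replicas(placement, failed_broker=102)
```

With this placement, partition 0 lives on brokers 101 and 102 and partition 1 on brokers 102 and 103, so after losing broker 102 every partition still has one copy left, on 101 and 103 respectively.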
Concept of a leader for a partition
At any time only ONE broker can be the leader for a given partition
Only that leader can receive and serve data for the partition
The other brokers holding replicas synchronize the data from the leader
Therefore each partition has one leader and multiple in-sync replicas (ISRs)
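A minimal sketch of the leader idea, assuming one partition replicated on brokers 101 and 102. `PartitionState` and `handle_broker_failure` are hypothetical names, and picking the first remaining in-sync replica as the new leader is a simplification of what the cluster controller really does.

```python
from dataclasses import dataclass


@dataclass
class PartitionState:
    leader: int  # broker id currently serving reads and writes
    isr: list    # in-sync replicas; in this sketch the list includes the leader


def handle_broker_failure(state, failed_broker):
    """Drop the failed broker and promote a new leader if necessary."""
    remaining = [b for b in state.isr if b != failed_broker]
    if not remaining:
        raise RuntimeError("no in-sync replica left to lead the partition")
    # Keep the current leader if it survived, otherwise promote a replica.
    leader = state.leader if state.leader != failed_broker else remaining[0]
    return PartitionState(leader=leader, isr=remaining)


state = PartitionState(leader=101, isr=[101, 102])
after = handle_broker_failure(state, failed_broker=101)
```

After broker 101 fails, broker 102 takes over as leader, so there is still exactly one leader for the partition.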