Factors that impact Kafka performance
Disk (I/O):
Kafka reads data sequentially, so pick a disk type that performs well for sequential I/O
Format the disks with the XFS filesystem (easiest, no tuning is required)
If read/write throughput is your bottleneck:
•It is possible to spread the Kafka data across multiple mounted disks
•The config is log.dirs=/disk1/kafka-logs,/disk2/kafka-logs,/disk3/kafka-logs ….
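As a rough illustration of the log.dirs setting above, the following sketch builds the property line from a list of mount points. The helper name kafka_log_dirs is hypothetical, and the /disk1, /disk2, /disk3 mount points and the kafka-logs subdirectory simply mirror the example above.

```shell
# Hypothetical helper (not part of Kafka): print a log.dirs line with one
# kafka-logs directory per mount point. The function body runs in a
# subshell so its variables do not leak into the caller.
kafka_log_dirs() (
  dirs=""
  for mount in "$@"; do
    # Strip a trailing slash, then join the entries with commas.
    dirs="${dirs:+$dirs,}${mount%/}/kafka-logs"
  done
  printf 'log.dirs=%s\n' "$dirs"
)

# Example mount points, assumed for illustration.
kafka_log_dirs /disk1 /disk2 /disk3
```

The printed line is meant to be placed in the broker's server.properties file.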
Kafka performance stays constant regardless of the amount of data stored in Kafka
•Make sure data expires fast enough (the default retention is one week)
•Make sure you monitor disk performance
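The one-week default corresponds to the broker setting log.retention.hours=168. The sketch below converts a retention period in hours into the millisecond value expected by log.retention.ms. The helper name and the 72-hour example are assumptions for illustration only.

```shell
# Hypothetical helper: convert a retention period in hours ($1) into a
# log.retention.ms broker setting.
retention_ms_line() {
  printf 'log.retention.ms=%s\n' "$(( $1 * 60 * 60 * 1000 ))"
}

retention_ms_line 168   # the one-week default
retention_ms_line 72    # a shorter retention of three days (assumed value)
```

A shorter retention keeps less data on disk, at the cost of a smaller window for consumers to re-read old messages.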
Apache Kafka –Performance
Network:
Latency is key in Kafka
•Make sure your Kafka instances and ZooKeeper instances are geographically close
•Don’t put one broker in the USA and another broker of the same cluster in Europe; the latency between them would slow down the whole cluster
•Having two brokers in the same rack is good for performance, but a big risk if the rack goes down
Network bandwidth is key in Kafka
•The network is likely to be your bottleneck
•Make sure you have enough bandwidth to handle many connections and TCP requests efficiently
•Make sure the network is high performance
Monitor network usage so you can troubleshoot when the network becomes the bottleneck
RAM:
Kafka has amazing performance, largely thanks to the OS page cache, which uses RAM
Understanding RAM in Kafka means understanding two parts:
•The Java heap for the Kafka process
•The rest of the RAM, used by the OS page cache
Overall, a Kafka production machine should have at least 8 GB of RAM (16 GB or 32 GB is better)
RAM –Java HEAP:
When you launch Kafka, you specify the heap options through the KAFKA_HEAP_OPTS environment variable
The recommendation is to assign a maximum heap size (-Xmx) of 4 GB to get started
export KAFKA_HEAP_OPTS="-Xmx4g"
Kafka should keep heap usage low over time; heap usage increases as you add more partitions to your broker
RAM –OS Page Cache:
The remaining RAM is automatically used for the Linux OS page cache
The page cache buffers data on its way to and from disk, and this is what gives Kafka its amazing performance
You don’t have to configure anything!
Any unused memory is automatically claimed by the operating system and assigned to the page cache
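To make the split between heap and page cache concrete, here is a minimal sketch that takes the machine's total RAM and the Kafka heap size and prints what is left for the page cache. The helper name is hypothetical, sizes are whole gigabytes, and the result ignores memory used by the OS itself and by other processes, so it is only an upper bound.

```shell
# Hypothetical helper: given total RAM ($1) and the Kafka heap size ($2),
# both in whole GB, print the heap option and the RAM left over for the
# OS page cache. Memory used by the OS itself is not accounted for.
kafka_memory_split() {
  if [ "$2" -ge "$1" ]; then
    echo "heap must be smaller than total RAM" >&2
    return 1
  fi
  printf 'KAFKA_HEAP_OPTS=-Xmx%sg\n' "$2"
  printf 'page cache (at most): %s GB\n' "$(( $1 - $2 ))"
}

# A 16 GB machine with the suggested 4 GB heap.
kafka_memory_split 16 4
```

Keeping the heap modest is what leaves most of the RAM available to the page cache.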
CPU:
Generally, CPU won’t be a bottleneck for Kafka because Kafka does not parse the messages
If you have SSL enabled, Kafka has to encrypt and decrypt every payload, which adds load on the CPU
Data compression can hurt CPU performance in some cases if you force Kafka to do it.
Instead, make sure the producer and consumer do the compression work and send compressed data to Kafka
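One way to keep compression work on the client is to set compression.type in the producer configuration. The sketch below writes such a config file. The helper name, the output path, and the choice of snappy are assumptions for illustration; gzip, lz4, and zstd are the other codecs the producer accepts.

```shell
# Hypothetical helper: write a producer config that compresses batches on
# the client, so the broker does not have to compress them itself.
# $1 = output file, $2 = codec (gzip, snappy, lz4, or zstd)
write_producer_compression() {
  cat > "$1" <<EOF
# Compress on the producer side instead of on the broker.
compression.type=$2
EOF
}

# File location and codec are assumed for illustration.
out="${TMPDIR:-/tmp}/producer.properties"
write_producer_compression "$out" snappy
cat "$out"
```

On the broker side, compression.type defaults to producer, which keeps whatever codec the producer chose.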
OS:
Use Linux or Solaris to run a Kafka cluster
Don’t use Windows, especially for production deployments
Increase file descriptor limits (at least 100,000 as a starting point)
https://unix.stackexchange.com/a/8949/170887
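A quick way to see whether the current limit meets that starting point is to compare the output of ulimit -n with the target. The helper name below is hypothetical, and the check only reports the limit of the shell it runs in.

```shell
# Hypothetical helper: succeed if the current open-file limit ($2) meets
# the required minimum ($1). A limit of "unlimited" always passes.
fd_limit_ok() {
  [ "$2" = "unlimited" ] || [ "$2" -ge "$1" ]
}

# 100000 is the starting point suggested above.
current=$(ulimit -n)
if fd_limit_ok 100000 "$current"; then
  echo "open-file limit OK: $current"
else
  echo "open-file limit too low: $current (want at least 100000)"
fi
```

Run the check as the user that starts the broker. How the limit is raised depends on the system, for example through /etc/security/limits.conf or a systemd unit's LimitNOFILE setting.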
Make sure only Kafka is running on the machine. Anything else will just slow it down
Make sure you use Java 8 and Scala 2.11