In these slides, the speaker shares their experiences in the IT industry, focusing on the integration of Apache Kafka with MuleSoft. They start by providing an overview of Kafka, detailing its pub-sub model, its ability to handle large volumes of data, and its role in real-time data pipelines and analytics. The speaker then explains Kafka's architecture, covering topics such as partitions, producers, consumers, brokers, and replication.
The discussion moves on to Kafka connector operations within MuleSoft, including publish, consume, commit, and seek, which are demonstrated in a practical demo. The speaker also emphasizes important design considerations like connector configuration, flow design, topic management, consumer group management, offset management, and logging. The session wraps up with a Q&A segment where various Kafka-related queries are addressed.
Size: 37.71 MB
Language: en
Added: Jul 13, 2024
Slides: 19 pages
Slide Content
Bangalore/Patna MuleSoft Meetup Integrating Kafka with Mule 4
Safe Harbour Statement
Both the speaker and the host are organizing this meetup in an individual capacity only. We are not representing our companies here. This presentation is strictly for learning purposes only. The organizer/presenter does not take any responsibility that the same solution will work for your business requirements. This presentation is not meant for any promotional activities.
Housekeeping
A recording of this meetup will be uploaded to the events page within 24 hours. Questions can be submitted/asked at any time in the Chat/Questions & Answers tab. Make it more interactive!!! Share your feedback! Rate this meetup session by filling out the feedback form at the end of the day. We love feedback!!
SPEAKERS
Guru Pratheek J K
Working as Digital Engineering Senior Engineer at NTT Data North Americas
4 years of total experience
3.5+ years of experience in MuleSoft
5x MuleSoft Certified
Math and Cricket Enthusiast
APACHE KAFKA WITH MULE 4
OBJECTIVE
- Overview
- What is Kafka?
- Apache Kafka in Mule 4
- Kafka Connector and Operations
- Demo
- Design Considerations for Kafka Applications
KAFKA / APACHE KAFKA
- Kafka is a distributed event streaming platform capable of handling trillions of events.
- Developed at LinkedIn and open-sourced through the Apache Software Foundation in 2011.
- Widely used for building real-time data pipelines and streaming applications.
- Enables easy connection to non-Mule applications using the REST API.
- Kafka's design focuses on high throughput and low latency, making it suitable for real-time data processing and integration scenarios.
- Displays usage statistics on the number of messages and API requests.
TYPES OF KAFKA
- Apache Kafka: The open-source, distributed event streaming platform. Can be deployed on-premises, in the cloud, or in hybrid environments.
- Confluent Kafka: A commercial distribution of Apache Kafka developed by Confluent, offering additional features and support. Available as Confluent Platform for on-premises and cloud deployments, and as Confluent Cloud, a fully managed Kafka service.
KAFKA ARCHITECTURE
KAFKA TOPIC
KAFKA COMPONENTS
- Producers: Applications or services that send data to Kafka topics.
- Consumers: Applications or services that read data from Kafka topics.
- Topics: The logical channels to which producers send messages and from which consumers read them, analogous to a queue.
- Partitions: Topics are further divided into partitions, each being an ordered sequence of messages.
- Brokers: Kafka servers that store and manage data; they handle incoming data streams, maintain data, and serve consumer requests.
- Replication: Kafka replicates data across multiple brokers to ensure fault tolerance.
- Consumer Groups: A consumer group is a set of consumers that work together to consume messages from a topic, allowing message consumption to be load-balanced.
- Offset: A unique identifier for each message within a partition; it tracks the consumer's position in that partition.
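To make these roles concrete, the following is a minimal sketch using the plain Apache Kafka Java client (not the Mule 4 connector used in the demo). The broker address localhost:9092, the topic name "orders", and the group id "order-processors" are illustrative assumptions, not values from the slides.

// Minimal producer/consumer sketch with the Apache Kafka Java client.
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class KafkaRolesSketch {
    public static void main(String[] args) {
        // Producer: sends a keyed message; the key determines the target partition.
        Properties prodProps = new Properties();
        prodProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        prodProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        prodProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(prodProps)) {
            producer.send(new ProducerRecord<>("orders", "order-123", "{\"amount\": 42}"));
        }

        // Consumer: joins the consumer group "order-processors"; the brokers assign
        // it a share of the topic's partitions and track its offsets.
        Properties consProps = new Properties();
        consProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        consProps.put(ConsumerConfig.GROUP_ID_CONFIG, "order-processors");
        consProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        consProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consProps)) {
            consumer.subscribe(List.of("orders"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> r : records) {
                // partition() and offset() expose the position concepts described above.
                System.out.printf("partition=%d offset=%d value=%s%n",
                        r.partition(), r.offset(), r.value());
            }
        }
    }
}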
ADVANTAGES OF KAFKA
- High Throughput and Performance
- Scalable Architecture
- Durability and Persistence
- Fault Tolerance and Reliability
- Real-time Data Processing
- Decoupling of Data Streams
- Stream Processing Capabilities
- Exactly-once Semantics
- Rich Ecosystem and Integration
- Efficient Storage Management
- Flexible Deployment Options
- Cost-Effective
- Strong Community Support
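The exactly-once semantics item above refers to Kafka's idempotent and transactional producer support. The sketch below shows one way this is typically enabled with the Kafka Java client; the transactional id "orders-tx-1" and the topic names are illustrative assumptions, not something demonstrated in the slides.

// Sketch: idempotent, transactional producer with the Kafka Java client.
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

public class ExactlyOnceSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");      // no duplicate writes on retry
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "orders-tx-1"); // enables atomic multi-record sends

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            try {
                producer.send(new ProducerRecord<>("orders", "order-123", "created"));
                producer.send(new ProducerRecord<>("payments", "order-123", "charged"));
                producer.commitTransaction();   // both records become visible atomically
            } catch (Exception e) {
                producer.abortTransaction();    // neither record is exposed to read_committed consumers
                throw new RuntimeException("transaction failed", e);
            }
        }
    }
}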
DESIGN CONSIDERATIONS
- Connector Configuration: Properly configure the Kafka connector in Mule 4 to handle Kafka topics, consumer groups, and producer settings.
- Flow Design: Design Mule flows to handle message consumption, processing, and production efficiently.
- Transformations: Use DataWeave to transform between Kafka message formats and API payloads only when necessary, especially at high volumes.
- Topic Management: Design Kafka topics based on business requirements, considering naming conventions, partitioning strategies, and replication factors.
- Consumer Group Management: Use appropriate consumer groups to balance load and ensure each message is processed by only one consumer in the group.
- Offset Management: Implement custom offset management if needed.
- Mule Workers: The number of Mule workers/replicas should ideally match the number of partitions of the topic.
- Logging: Avoid logging the Kafka payload, as it can quickly exhaust log limits depending on volume and the subscription level of the MuleSoft licence.
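For the consumer group, offset management, and logging points above, the sketch below illustrates the underlying Kafka client pattern that the connector's consume, commit, and seek operations correspond to conceptually: auto-commit is disabled, offsets are committed only after processing, and seek rewinds a partition. The broker address, topic, group id, and offset value are illustrative assumptions.

// Sketch: manual offset management with the Kafka Java client.
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class OffsetManagementSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-processors");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // commit only after processing
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> r : records) {
                process(r);            // business logic; log metadata, not the full payload
            }
            consumer.commitSync();     // conceptually what a commit operation does after processing

            // Conceptually what a seek operation does: rewind partition 0 to offset 100
            TopicPartition tp = new TopicPartition("orders", 0);
            if (consumer.assignment().contains(tp)) {
                consumer.seek(tp, 100L);
            }
        }
    }

    private static void process(ConsumerRecord<String, String> r) {
        System.out.printf("key=%s partition=%d offset=%d%n", r.key(), r.partition(), r.offset());
    }
}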