Apache Kafka 101 by Confluent Developer Friendly

itplanningandarchite 61 views 43 slides May 03, 2024

About This Presentation

Covers the basics of implementing Apache Kafka.


Slide Content

Apache Kafka ® 101

What's Covered
- Introduction
- Hands On: Your First Kafka Application in 10 Minutes or Less
- Topics
- Partitioning
- Hands On: Partitioning
- Brokers
- Replication
- Producers
- Consumers
- Hands On: Consumers
- Ecosystem
- Kafka Connect
- Hands On: Kafka Connect
- Confluent Schema Registry
- Hands On: Confluent Schema Registry
- Kafka Streams
- ksqlDB
- Hands On: ksqlDB

Introduction

Event sources:
- Internet of Things
- Business process change
- User interaction
- Microservice output

Notification + State

Key Value

Hands On: Your First Kafka Application in 10 Minutes or Less

GET STARTED TODAY Confluent Cloud cnfl.io/confluent-cloud PROMO CODE: KAFKA101 $101 of free usage


Topics
- Named container for similar events
- System contains lots of topics
- Can duplicate data between topics
- Durable logs of events
- Append only
- Can only seek by offset, not indexed
- Events are immutable

Topics are durable

Retention period is configurable
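To make the "durable, append-only log" contract concrete, here is a minimal in-memory sketch (not Kafka's actual storage engine): events are only ever appended, each append returns an offset, and reads address the log by offset rather than by any index on key or value.

```java
import java.util.ArrayList;
import java.util.List;

// A minimal sketch of one topic partition as an append-only log.
// Events are immutable once written; reads are by offset only.
public class LogSketch {
    public record Event(String key, String value) {}

    private final List<Event> log = new ArrayList<>();

    // Append returns the offset the event was written at.
    public long append(String key, String value) {
        log.add(new Event(key, value));
        return log.size() - 1;
    }

    // Seek by offset; there is no secondary index on key or value.
    public Event read(long offset) {
        return log.get((int) offset);
    }

    public static void main(String[] args) {
        LogSketch partition = new LogSketch();
        long first = partition.append("thermostat-1", "21.5");
        long second = partition.append("thermostat-2", "19.0");
        System.out.println(first + " " + second);      // offsets assigned in append order
        System.out.println(partition.read(second).value());
    }
}
```

A real partition is persisted to disk and trimmed by the retention period; the offset-addressed, append-only shape is the part this sketch keeps.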

Partitioning

[Diagram: a partitioned topic split into Partition 0, Partition 1, and Partition 2; incoming events 1 through 9 are distributed across the three partitions]
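The distribution shown in the diagram is driven by the event key. As a sketch: Kafka's default partitioner hashes the serialized key (with murmur2); here `String.hashCode()` stands in for that hash, so exact partition numbers differ from a real producer's, but the property illustrated is the same: equal keys always land on the same partition.

```java
// Sketch of key-based partition assignment. hashCode() is a stand-in
// for Kafka's real key hash, so treat the numbers as illustrative.
public class PartitionerSketch {
    public static int partitionFor(String key, int numPartitions) {
        // Mask off the sign bit so the modulo result is non-negative.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // The same key always maps to the same partition, which is what
        // preserves per-key ordering; different keys spread out.
        System.out.println(partitionFor("user-42", 3));
        System.out.println(partitionFor("user-42", 3)); // same as the line above
        System.out.println(partitionFor("user-99", 3));
    }
}
```

Events with no key are instead spread across partitions without this guarantee.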

Hands On: Partitioning


Brokers

- A computer, instance, or container running the Kafka process
- Manage partitions
- Handle write and read requests
- Manage replication of partitions
- Intentionally very simple

Replication

[Diagram: a partitioned topic with Partition 0, Partition 1, and Partition 2, each replicated across brokers]

- Copies of data for fault tolerance
- One lead partition and N-1 followers
- In general, writes and reads happen to the leader
- An invisible process to most developers
- Tunable in the producer
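The "one leader and N-1 followers" layout can be sketched as follows. This is only an illustration of the replica shape: real Kafka assignment also balances leader counts and can respect rack placement, which this stand-in ignores.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of replica placement: each partition gets one leader and N-1
// followers, all on distinct brokers.
public class ReplicationSketch {
    // Returns broker ids holding one partition's replicas; index 0 is the leader.
    public static List<Integer> replicasFor(int partition, int numBrokers, int replicationFactor) {
        List<Integer> replicas = new ArrayList<>();
        for (int i = 0; i < replicationFactor; i++) {
            // Step through brokers so no broker holds two copies of the same partition.
            replicas.add((partition + i) % numBrokers);
        }
        return replicas;
    }

    public static void main(String[] args) {
        // 3 brokers, replication factor 3: every broker holds a copy,
        // and writes/reads go to the leader (the first entry).
        System.out.println(replicasFor(0, 3, 3)); // [0, 1, 2]
        System.out.println(replicasFor(1, 3, 3)); // [1, 2, 0]
    }
}
```

Losing one broker then leaves N-1 surviving copies, and one of the followers is promoted to leader.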

Producers

[Diagram: a producer writing events into the three partitions of a topic]

final Properties props = KafkaProducerApplication.loadProperties(args[0]);
final String topic = props.getProperty("output.topic.name");
final Producer<String, String> producer = new KafkaProducer<>(props);
final KafkaProducerApplication producerApp = new KafkaProducerApplication(producer, topic);

final ProducerRecord<String, String> producerRecord = new ProducerRecord<>(outTopic, key, value);
return producer.send(producerRecord);

Producers
- Client application
- Puts messages into topics
- Connection pooling
- Network buffering
- Partitioning


Consumers

final Properties consumerAppProps = KafkaConsumerApplication.loadProperties(args[0]);
final String filePath = consumerAppProps.getProperty("file.path");
final Consumer<String, String> consumer = new KafkaConsumer<>(consumerAppProps);
final ConsumerRecordsHandler<String, String> recordsHandler = new FileWritingRecordsHandler(Paths.get(filePath));
final KafkaConsumerApplication consumerApplication = new KafkaConsumerApplication(consumer, recordsHandler);

public void runConsume(final Properties consumerProps) {
  try {
    consumer.subscribe(Collections.singletonList(consumerProps.getProperty("input.topic.name")));
    while (keepConsuming) {
      final ConsumerRecords<String, String> consumerRecords = consumer.poll(Duration.ofSeconds(1));
      recordsHandler.process(consumerRecords);
    }

public void process(final ConsumerRecords<String, String> consumerRecords) {
  final List<String> valueList = new ArrayList<>();
  consumerRecords.forEach(record -> valueList.add(record.value()));
  if (!valueList.isEmpty()) {
    try {
      Files.write(path, valueList, StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }
}

Consumers
- Client application
- Reads messages from topics
- Connection pooling
- Network protocol
- Horizontally and elastically scalable
- Maintains ordering within partitions at scale
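Scaling while keeping per-partition ordering works because each partition is read by exactly one consumer in a group. As a sketch of that split (in real Kafka the group coordinator negotiates this through a pluggable assignor; this round-robin stand-in only shows the exclusivity property):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of how a consumer group divides partitions among its members.
// Each partition goes to exactly one consumer, which is what preserves
// per-partition ordering as the group scales out.
public class AssignmentSketch {
    public static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String c : consumers) assignment.put(c, new ArrayList<>());
        for (int p = 0; p < numPartitions; p++) {
            // Round-robin: partition p goes to one and only one consumer.
            assignment.get(consumers.get(p % consumers.size())).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Two consumers over three partitions: one consumer takes two.
        System.out.println(assign(List.of("A", "B"), 3)); // {A=[0, 2], B=[1]}
    }
}
```

Adding a consumer to the group triggers a rebalance: the same function over a longer consumer list yields fewer partitions per member, up to one consumer per partition.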

[Diagram: consumer groups reading from the three partitions; as consumers join group A, partitions are rebalanced so each partition is read by exactly one consumer in the group]

java application > configuration > dev.properties:

group.id=movie_ratings
bootstrap.servers=localhost:29092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
acks=all

# Properties below this line are specific to code in this application
input.topic.name=input-topic
output.topic.name=output-topic


Hands On: Consumers


Ecosystem

Kafka Connect

- Data integration system and ecosystem
- Because some other systems are not Kafka
- External client process; does not run on brokers
- Horizontally scalable
- Fault tolerant
- Declarative

[Diagram: Kafka Connect sitting between a data source and the Kafka cluster, and between the cluster and a data sink]

Example sink connector configuration:

{
  "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
  "connection.url": "http://elasticsearch:9200",
  "tasks.max": "1",
  "topics": "simple.elasticsearch.data",
  "name": "simple-elasticsearch-connector",
  "type.name": "doc",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "value.converter.schemas.enable": "false"
}

Connectors
- Pluggable software component
- Interfaces to external system and to Kafka
- Also exist as runtime entities
- Source connectors act as producers
- Sink connectors act as consumers

Hands On: Kafka Connect


Confluent Schema Registry

New consumers will emerge

Schemas evolve with the business

Schema Registry
- Server process external to Kafka brokers
- Maintains a database of schemas
- HA deployment option available
- Consumer/Producer API component

Schema Registry
- Defines schema compatibility rules per topic
- Producer API prevents incompatible messages from being produced
- Consumer API prevents incompatible messages from being consumed

Supported Formats
- JSON Schema
- Avro
- Protocol Buffers

Hands On: Confluent Schema Registry


Kafka Streams

Kafka Streams
- Functional Java API
- Filtering, grouping, aggregating, joining, and more
- Scalable, fault-tolerant state management
- Scalable computation based on consumer groups
- Integrates within your services as a library
- Runs in the context of your application
- Does not require special infrastructure

Code walkthrough: https://youtu.be/UbNoL5tJEjc?t=443
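The linked walkthrough shows a live Kafka Streams topology. As a stand-in that runs without a cluster, the snippet below uses plain `java.util.stream` to show the same filter/group/aggregate shape that Kafka Streams applies continuously to unbounded topic data; the `Rating` record and movie titles are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Filter, group, and aggregate over a batch of events, mirroring the
// shape of a Kafka Streams topology on a finite in-memory list.
public class GroupingSketch {
    public record Rating(String title, int stars) {}

    public static Map<String, Double> avgRatings(List<Rating> ratings) {
        return ratings.stream()
                .filter(r -> r.stars() >= 1)       // drop invalid events
                .collect(Collectors.groupingBy(    // group by key...
                        Rating::title,
                        Collectors.averagingInt(Rating::stars))); // ...and aggregate
    }

    public static void main(String[] args) {
        var out = avgRatings(List.of(
                new Rating("Metropolis", 8),
                new Rating("Metropolis", 10),
                new Rating("Aliens", 9)));
        System.out.println(out.get("Metropolis")); // 9.0
    }
}
```

The key difference is that a Kafka Streams aggregation never "finishes": state is kept (fault-tolerantly) and the result is updated as each new event arrives.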

ksqlDB

ksqlDB
- A database optimized for stream processing
- Runs on its own scalable, fault-tolerant cluster adjacent to the Kafka cluster
- Stream processing programs written in SQL

src > main > ksql > rate-movies.sql:

CREATE TABLE rated_movies AS
  SELECT title,
         SUM(rating) / COUNT(rating) AS avg_rating
  FROM ratings
  INNER JOIN movies ON ratings.movie_id = movies.movie_id
  GROUP BY title
  EMIT CHANGES;

ksqlDB
- Command line interface
- REST API for application integration
- Java library
- Kafka Connect integration

Hands On: ksqlDB


Your Apache Kafka journey begins here developer.confluent.io