STREAMING WITH KAFKA Publish/Subscribe Messaging with Kafka

GravenGuan 5 views 7 slides Apr 14, 2024

STREAMING WITH KAFKA
Publish/Subscribe Messaging with Kafka

What is streaming?
■ So far we’ve really just talked about processing historical, existing big data
– Sitting on HDFS
– Sitting in a database
■ But how does new data get into your cluster? Especially if it’s “big data”?
– New log entries from your web servers
– New sensor data from your IoT system
– New stock trades
■ Streaming lets you publish this data, in real time, to your cluster.
– And you can even process it in real time as it comes in!

Two problems
■ How to get data from many different sources flowing into your cluster
■ How to process it once it gets there
■ First, let’s focus on the first problem

Enter Kafka
■ Kafka is a general-purpose publish/subscribe messaging system
■ Kafka servers store all incoming messages from publishers for some period of time, and publish them to a stream of data called a topic
■ Kafka consumers subscribe to one or more topics, and receive data as it’s published
■ A stream / topic can have many different consumers, each with its own position in the stream maintained
■ It’s not just for Hadoop
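
The ideas on this slide (messages kept for a retention period, and each consumer tracking its own position) can be sketched with a small in-memory model. This is only an illustration and does not use Kafka or any Kafka client library; the names Topic, Consumer, publish, expire and poll are hypothetical and chosen for this sketch.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Toy stand-in for a topic: an append-only log with time-based retention."""
    retention_seconds: float
    log: list = field(default_factory=list)  # (timestamp, message) pairs
    base_offset: int = 0                      # offset of log[0] once old entries expire

    def publish(self, message, now=None):
        now = time.time() if now is None else now
        self.log.append((now, message))

    def expire(self, now=None):
        """Drop messages older than the retention period."""
        now = time.time() if now is None else now
        while self.log and now - self.log[0][0] > self.retention_seconds:
            self.log.pop(0)
            self.base_offset += 1

    def read_from(self, offset):
        """Return (messages at or after `offset`, next offset to read)."""
        start = max(offset, self.base_offset) - self.base_offset
        messages = [message for _, message in self.log[start:]]
        return messages, self.base_offset + len(self.log)


class Consumer:
    """Each consumer keeps its own position, independent of other consumers."""

    def __init__(self, topic):
        self.topic = topic
        self.position = 0

    def poll(self):
        messages, self.position = self.topic.read_from(self.position)
        return messages


# Two consumers read the same topic at their own pace.
topic = Topic(retention_seconds=60)
fast, slow = Consumer(topic), Consumer(topic)
topic.publish("trade-1", now=0)
first_for_fast = fast.poll()    # fast has now seen trade-1
topic.publish("trade-2", now=1)
second_for_fast = fast.poll()   # fast only gets what is new to it
all_for_slow = slow.poll()      # slow starts from the beginning
```

Because the log is stored rather than handed off, a consumer that polls late still sees earlier messages, as long as they have not passed the retention period.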

Kafka architecture
[Diagram: a Kafka Cluster at the center, surrounded by Producers, Consumers, Stream Processors and Connectors; the outer nodes are apps and databases (DB)]

How Kafka scales
Image: kafka.apache.org
■ Kafka itself may be distributed among many processes on many servers
– This distributes the storage of stream data as well
■ Consumers may also be distributed
– Consumers of the same group will have messages distributed amongst them
– Consumers of different groups will each get their own copy of each message
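
The consumer group behavior above can be illustrated with a small simulation. This is a sketch under stated assumptions, not how Kafka is implemented: it assumes the topic is split into partitions, uses CRC32 as a stand-in for the key hashing a real client would do, and assigns partitions to group members round-robin. PartitionedTopic, assign_partitions and consume are hypothetical names.

```python
from zlib import crc32


class PartitionedTopic:
    """Toy topic whose messages are spread across several partitions."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, value):
        # The same key always lands in the same partition (CRC32 is a stand-in hash).
        index = crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[index].append(value)


def assign_partitions(num_partitions, members):
    """Round-robin assignment of partitions to the members of one group."""
    assignment = {member: [] for member in members}
    for partition in range(num_partitions):
        assignment[members[partition % len(members)]].append(partition)
    return assignment


def consume(topic, groups):
    """groups: {group_name: [member, ...]} -> {group_name: {member: [messages]}}."""
    delivered = {}
    for group, members in groups.items():
        assignment = assign_partitions(len(topic.partitions), members)
        delivered[group] = {
            member: [msg for p in parts for msg in topic.partitions[p]]
            for member, parts in assignment.items()
        }
    return delivered


topic = PartitionedTopic(num_partitions=4)
for i in range(20):
    topic.publish(key=f"sensor-{i}", value=f"reading-{i}")

# "analytics" has two members that share the work; "archive" has one member.
result = consume(topic, {"analytics": ["a1", "a2"], "archive": ["b1"]})
```

In this model the two analytics members split the 20 readings between them with no overlap, while the single archive member receives all 20, which mirrors the two sub-bullets above.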

Let’s play
■ Start Kafka on our sandbox
■ Set up a topic
– Publish some data to it, and watch it get consumed
■ Set up a file connector
– Monitor a log file and publish additions to it
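
To make the file connector idea concrete, here is a minimal sketch of "monitor a log file and publish additions to it". It is not Kafka Connect and uses no Kafka API: a plain Python list stands in for the topic, and read_new_lines and publish_additions are hypothetical helper names. The point is the pattern of remembering a position in the file and publishing only what was appended since.

```python
def read_new_lines(path, position):
    """Return (complete lines appended since `position`, new byte position)."""
    with open(path, "rb") as f:
        f.seek(position)
        data = f.read()
    # Stop at the last newline so a half-written line is picked up next time.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    return lines, position + end


def publish_additions(path, position, topic):
    """Append any new complete lines to `topic` (a list standing in for a Kafka topic)."""
    lines, position = read_new_lines(path, position)
    topic.extend(lines)
    return position
```

A caller would keep the returned position and call publish_additions again on a timer or when the file changes, for example `position = publish_additions("access.log", position, topic)`, where access.log is a placeholder path.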
