Unlocking value with event-driven architecture by Confluent
ConfluentInc
215 views
79 slides
Jul 18, 2024
About This Presentation
Harness the power of real-time data streaming and event-driven microservices for the future of Sky with Confluent and Kafka®.
In this tech talk we explore how Confluent and Apache Kafka® can transform enterprise architecture and unlock new business opportunities. We cover the key concepts and guide you through building scalable, resilient, real-time data streaming applications.
You will learn how to build event-driven microservices with Confluent, taking advantage of a modern, reactive architecture.
The talk also presents real-world Confluent and Kafka® use cases, showing how these technologies can streamline business processes and deliver concrete value.
Size: 17.26 MB
Language: en
Added: Jul 18, 2024
Slides: 79 pages
Slide Content
Unlocking Value With Event-Driven
Architecture By Confluent
Agenda
9:30 - 10:00   Welcome Coffee
10:00 - 10:15  SKY toward Event-Driven Architecture
Daniele De Angelis, Senior Specialist – Architecture, Sky Italia
Luca Cogliandro, Senior Manager – Software Engineering Integration, Sky Italia
10:15 - 11:15  Session 1: How Event-Driven Architectures Enable Business Value
Samuele Dell'Angelo, Senior Solutions Engineer, Confluent
11:15 - 11:30  Break
11:30 - 12:30  Session 2: Monoliths to Event-Driven Microservices with Confluent
Samuele Dell'Angelo, Senior Solutions Engineer, Confluent
12:30 - 13:00  Networking Light Lunch
SKY toward Event-Driven Architecture
Daniele De Angelis
Senior Specialist – Architecture, Sky Italia
Luca Cogliandro
Senior Manager – Software Engineering Integration, Sky Italia
API Exposure
Adapter GET
Composite
Service
Asynchronous
Dispositive
Common services
Proactive
Monitoring
(shift-left)
Adapter
POST
Fast
Information
Cloud/PaaS
Infrastructure
Customer Touchpoint
(LightHouse)
Event Driven
Architecture
STEP 1
Data Driven
STEP 3
STEP 2
Guidelines
REST API Design
Pass Through Design
Business Logic Orchestration
Choreography and Event sourcing Application Design
Error Handling and Resubmission
Logging and monitoring
Pass Through
GET & POST
(synchronous API)
Logging
Every Hermes Enterprise Framework pillar is based on a set of Enterprise and Integration Guidelines. These pillars translate into a set of Design
Patterns to be used within the framework design and implementation process
CI/CD pipelines
Event-driven architecture is defined as a software design pattern in which decoupled applications can asynchronously publish and subscribe to events via an event
broker
Event messages are published to a message broker, and interested
subscribers or services can consume these events
Participants exchange events without a centralized point of control.
Each local transaction publishes domain events that trigger local
transactions in other services.
Benefits
➢Loose coupling
➢Fault tolerance
➢Cost-effective
Challenges
➢Ordering
➢Idempotency
✔Consents
✔RTD KPIs
Benefits
➢Distributed transaction
➢Rollback operation
➢Local transactions
isolation
Challenges
➢Complexity
➢Resilient implementation
➢Observability
✔Order creation
Message broker
Service A (local transaction): Msg Input A → Msg Output A
Service B (local transaction): Msg Input B → Msg Output B
Service C (local transaction): Msg Input C → Msg Output C
Event-driven Architecture based on Kafka Confluent Cloud
•Platform PaaS
•IAC CICD Automation
•Production Cluster Multi Zone
•Test Cluster Single Zone
Confluent Platform
Data Platform RTD
•Topic Name Strategy → Business Domain
•Schema Registry → Data Governance
Producer
Consumer
Service Exposure
•KPI
✔4 Million events produced and consumed in 10
minutes
✔Live update major KPI
Tracking Tools
Tibco Data Platform: Consents and KPI events
AWS
•Infrastructural
•Consumer Lag
•DLT not empty
•No events into
Topic
How Event-Driven Architectures
Enable Business Value
Samuele Dell’Angelo
Senior Solution Engineer, Confluent
7
Agenda
01 Intro to Event Streaming: What is Event Streaming?
02 Data: The New Currency of Business
03 Why you need Data Streaming
04 Why Confluent
05 Why Data Streaming for Data Products and AI
06 Customer Stories and use cases
Intro to Event Streaming
Software is Changing every Aspect of life
OLD WAY: Batch processing — siloed, slow
NEW WAY: Real-time stream processing — connected, fast
Today, Software Is the Business
What Are Events
Transportation
TPMS sensor in Carol’s car detected low tire-pressure at 5:11am.
Kafka
Banking
Alice sent $250 to Bob on Friday at 7:34pm.
Kafka
Retail
Sabine’s order of a Fujifilm camera was shipped at 9:10am.
Kafka
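The three examples above share a shape: someone or something did a specific thing at a specific time. A Kafka record is essentially that — a keyed, timestamped, immutable fact. A minimal sketch in Python (the `Event` class and its field names are illustrative, not a Kafka or Confluent API):

```python
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)  # events are immutable facts about the past
class Event:
    key: str        # e.g. an account id -- in Kafka, the key picks the partition
    event_type: str
    timestamp: str  # ISO-8601; when the fact occurred
    payload: dict

# "Alice sent $250 to Bob on Friday at 7:34pm" modeled as an event
transfer = Event(
    key="alice",
    event_type="payment.sent",
    timestamp="2024-07-12T19:34:00Z",
    payload={"from": "alice", "to": "bob", "amount": 250},
)

# Serialized form, as it might travel through a topic
record = json.dumps(asdict(transfer))
print(record)
```

Because the record is a self-describing fact rather than a command, any number of downstream consumers can react to it independently.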
The Shift in Thinking: Event Streams
Rich User
experiences
Real-time
events
Real-time
Event Streams
A Sale A shipment
A Trade
Customer
Actions
Data driven
operation
Real-time
Analytics
Microservices
Stream
Processing
Data: The New Currency
of Business Success
Data is the Fuel for a Modern Business
Loyalty Rewards
Payments Processing
Real-time Trades
Fraud Detection
Personalized Recommendations
Unlocking Data and Analytics
is Much Harder than it Should Be
There are Two Main Locations for Data
OPERATIONAL ESTATE
Serve the needs of applications to transact
with customers in real-time
ANALYTICAL ESTATE
Support after-the-fact business analysis and
reporting for various stakeholders
OPERATIONAL ESTATE
Serve the needs of applications to transact
with customers in real-time
ANALYTICAL ESTATE
Support after-the-fact business analysis and
reporting for various stakeholders
As Data Needs Multiplied, Data Complexity Escalated
… creating this giant mess!
Your teams are building applications on this messy data architecture
Recommendations
Real-time payments
Branch monitoring
Real-time bot detection
Real-time alerts
Live customer chatbot
Omni-channel orders
Real-time inventory
Mobile banking
Fraud detection
Why you need Data Streaming
Data Streaming: A New Paradigm to Set Data in Motion
Continuously ingest, process, govern and share streams of data in real-time
Real-time
Data
A Transfer
A Login
A Trade
A Customer
Experience
Rich Front-End
Customer Experiences
Real-Time Backend
Operations
New Products
and Services
“We need to shift our thinking from everything at rest to everything in motion.”
From Giant Mess to Central Nervous System
DATA STREAMING PLATFORM
“For a typical Fortune 1000 company, just a 10% increase in data
accessibility will result in more than $65 million additional net income.”
Shift Your Thinking
Point-in-time queries on static data
Continuous processing at generation time
Tightly coupled point-to-point integrations
Multi-subscriber decoupled architecture
Producer-presented, governed & reusable data
Complex data wrangling downstream
Batch-based data integration
Event-driven continuous data flows
Apply a Data Product Thinking
FINANCIAL SERVICES: Customers, Accounts, Web logins, Clickstream, Transactions, Payments, Branches, Claims, Trades
RETAIL & E-COMMERCE: Customers, Accounts, Purchases, Web clickstream, Inventory, Stores, Employees, Shipments
MANUFACTURING: Machine Telemetry, Inventory, Orders, Shipments, Suppliers, Assets, Equipment
HEALTHCARE: Patients, Accounts, Diagnoses, Treatments, Claims, Readings, Providers
TRANSPORTATION: Vehicle Telemetry, Drivers, Customers, Conditions, Deliveries, Repairs, Insurers
GAMING: Players, Achievements, Profiles, Games, Purchases, Chats, High Scores
Data products: Real-time, discoverable, contextualized, trustworthy and reusable
Use Data Products to Build New Business Applications
Companies that treat data like a
product can:
●Reduce the time it takes to
implement new use cases by as
much as 90%
●Decrease total ownership costs
by up to 30% (technology,
development, and maintenance)
●Reduce their risk and data
governance burden
Customers
Accounts
Web Logins
Transactions
Fraud Detection
Customers
Accounts
Inventory
Purchases
Web Clickstream
Customer Loyalty
Vehicle Telemetry
Repairs
Drivers
Customers
Delivery Management
Data Products Use Cases
Build an Effective Data Flywheel Strategy
by Reusing Data Products Across Multiple Use Cases
Supply chain systems
Watch lists
Profile mgmt
Incident mgmt
Customers
Threat vector
Customer churn
Transactions
Payments
Mainframe data
Inventory
Weather
Telemetry
IoT data
Notification engine
Payroll systems
Distribution, Order fulfillment
CRM systems
Mobile application
Personalization
Clickstreams
Customer loyalty
Change logs
Purchases
Recommendation engine
Web application
Payment service
Heatmap service
ITSM systems
Central log systems (CLS)
Fraud & SIEM systems
Alerting systems
AI/ML engines
Visualization apps
PROCESS
GOVERN
CONNECT
STREAM
Accounts
Customer
Profile
Purchases
Shipments
Claims Orders
Clickstreams
From Data Mess To Data Products
To Instant Value
Everywhere
CONNECT
Data
Systems
Inventory
Replenishment
Forecasting
…
Custom Apps &
Microservices
Personalization
Recommendation
Fraud
…
Confluent Data Streaming Platform
A new paradigm for reusable, real-time and
trustworthy data products
Why Confluent?
We are the pioneers
and experts in data
streaming
Created by the founders of Confluent
“Confluent has an asset that
competitors can't replicate: The
founders of Confluent were the
original developers of Kafka,
making it very hard to find anyone
that knows the open-source
project better than them.”
Confluent Is So Much More Than Apache Kafka
Reimagine data
streaming everywhere,
on-prem and in every
major public cloud
STREAM
Make data in motion
self-service, secure,
compliant and
trustworthy
GOVERN
Drive greater
data reuse with
always-on stream
processing
PROCESS
Make it easy to
on-ramp and off-ramp
data from existing
systems and apps
CONNECT
Operating Apache Kafka on Your Own Is Difficult
Value vs. Investment & Time — the Kafka adoption curve:
1. Early Interest / Experimentation
2. Identify a Project
3. Mission critical, in Production, but disparate LOBs
4. Mission-critical, connected LOBs
5. Central Nervous System
As the Kafka footprint expands across more use cases, teams, and environments, it creates significant ongoing operational burden… one that only grows over time, requiring more top engineering talent & resources
“One of the challenges with Kafka was its operational complexity, especially as the footprint expanded
across our organization. It’s a complex, distributed system, so we had to allocate a lot of our valuable
technical resources and expertise to babysit it and keep it running. This also required us to incur
additional in-house maintenance costs and risks.”
ONEROUS OPERATIONS
Free Up Talent to Move Up the Stack
Serverless
●Elastic scaling up &
down from 0 to GBps
●Auto capacity mgmt,
load balancing, and
upgrades
High Availability
●99.99% SLA
●Multi-region / AZ availability
across cloud providers
●Patches deployed in
Confluent Cloud before
Apache Kafka
Infinite Storage
●Store data cost- effectively
at any scale without
growing compute
DevOps Automation
●API-driven and/or
point-and-click ops
●Service portability &
consistency across cloud
providers and on-prem
Network Flexibility
●Public, VPC, and Private Link
●Seamlessly link across clouds
and on-prem with Cluster
Linking
“Before Confluent Cloud, when we had
an outage, developers would have to
stop their development work and
focus on operations until the problem
was sorted out, which in some cases
took up to three days. (Now) Confluent
takes care of everything for us, so our
developers can focus on building new
features and applications that add
value to the business.”
STREAM
$2.57M
Total savings
Operate 60%+ more efficiently with reduced
infrastructure costs, maintenance demands
and overhead, and downtime risk
257%
3-year ROI
Launch in months rather than years by
reducing the burden on your teams with our
fully managed cloud service
Save on Costs and Increase your ROI
Total Economic Impact of using Confluent • Forrester, March 2022 (Link)
STREAM
“We are in the business of selling and renting clothes.
We are not in the business of managing a data
streaming platform… If we had to manage everything
ourselves, I would’ve had to hire at least 10 more
people to keep the systems up and running.”
“Confluent Cloud made it possible for us to meet our
tight launch deadline with limited resources. With
event streaming as a managed service, we had no
costly hires to maintain our clusters and no worries
about 24x7 reliability."
The Confluent Advantage
Create a central nervous system for your data across your enterprise
Resilient
Infinite
Elastic
Run Consistently
Run Anywhere
Stream Governance
Stream Processing
120+ Connectors
Run Unified
Apache Kafka®, fully managed, and re-architected to harness the power of the cloud
Go way beyond Kafka to build
real-time apps quickly, reliably,
and securely
Flexible deployment options across
clouds and on-premises that
sync in real-time
Cloud Native Complete Everywhere
Built and supported by the founders of Apache Kafka®
Freedom of Choice
Committer-driven Expertise
Fully Managed Cloud Service Self-managed Software
Training Partners
Professional
Services
Enterprise
Support
End-to-end Security, Privacy, Compliance and Observability
Why Data Streaming for Data Products and AI
Connect and Share your Data Everywhere
“We are making Confluent the
true backbone of BHG, including
leveraging over 20 Confluent
connectors across both modern,
cloud-based technologies and
legacy systems, to help integrate
our critical apps and data
systems together.”
Cloud Cloud
On-premises
Run Anywhere Run Unified Run Consistently
Supporting any combination of on-prem, hybrid cloud and multicloud
deployments for real-time data interoperability
STREAM
Unlock Real-time Value From
Existing Systems and Applications with Ease
Pre-built Connectors
Use 120+ self-managed and 70+ fully
managed zero-code connectors for
instant connectivity from and to existing
data systems
Custom Connectors
Bring your own connector and run it
confidently and cost-effectively on our
fully managed cloud service
Native Integrations
Enjoy instant access to streaming data
directly within popular platforms, so you
never have to leave the tool of your choice
“Confluent provides exactly what we dreamed of: an ecosystem of tools to source and sink data from data
streams. It’s provided us not only with great data pipeline agility and flexibility but also a highly simplified
infrastructure that’s allowed us to reduce costs.”
Data Diode
CONNECT
Process your data once, process your data right
Process and Shape Data As It’s Generated
For Greater Reuse
Lower costs
Improve resource utilization and cost-effectiveness by
avoiding redundant processing across silos
Data reusability
Share consistent and reusable data streams widely with
downstream applications and systems
Data enrichment
Curate, filter, and augment data on-the-fly with additional
context to improve completeness and accuracy
Real-time everywhere
Power low-latency applications and pipelines that react to
real-time events, as and when they occur
PROCESS
Discover, Understand,
and Trust Your Data
Streams
Where did data come from?
Where is it going?
Where, when, and how was it transformed?
What’s the common taxonomy?
What is the current state of the stream?
Stream Catalog
Increase collaboration and productivity
with self-service data discovery
Stream Lineage
Understand complex data relationships
and uncover more insights
Stream Quality
Deliver trusted, high-quality event
streams to the business
“Confluent’s Stream Governance suite will play a major role in our expanded use of data in motion and creation of a
central nervous system for the enterprise. With the self-service capabilities in stream catalog and stream lineage,
we’ll be able to greatly simplify and accelerate the onboarding of new teams working with our most valuable data ."
GOVERN
Discover
Search and understand the data of interest
through a self-service user interface
Access
Request and securely grant access to data
streams through simple clicks of buttons
Build
Start working with your data in real-time to pipe it
into existing systems or develop real-time apps
Explore and Use
Real-time Data Products
with a Self-service Portal
GOVERN
Without contextualized, trusted, current data…
…LLMs can’t drive meaningful value.
What is the status of my flight to New York?
It is currently delayed by 2 hours and expected to
depart at 5 pm GMT.
Is there another flight available to the same city that
will depart and arrive sooner? What are the seating
options and cost?
Can your GenAI assistant
remember data from an earlier
conversation?
What is the source of this
information? Is this trustworthy?
Is it fresh and accurate?
How do you augment customer
data with real-time data and
process them on the fly to
provide meaningful insights?
The next available flight to New York with United
departs later but will arrive faster than your current
flight.
The only available seats in this flight are first class
window seats and costs $1,500.
Confluent enables real-time GenAI-powered
applications at scale
How?
•Integrates disparate operational data across the enterprise
in real time for reliable, trustworthy use
•Organizes unstructured enterprise data embeddings into
vector stores
•Decouples customer-facing applications from LLM call
management to provide reliable, reactive experiences that
scale horizontally
•Enables LLMs, vector stores, and embedding models to be
treated as modular components that can be substituted as
technology improves
Enabling Generative AI in the Enterprise
Historic Public Data
Generative
AI Model
Intelligent
Business-Specific
Co-Pilot
User Interaction
??
Enabling Generative AI in the Enterprise
Historic Public Data
Generative
AI Model
Intelligent
Business-Specific
Co-Pilot
User Interaction
DATA STREAMING PLATFORM
Customer Stories and use cases
Data Streaming Use Cases for Media & Entertainment
INFRASTRUCTURE
USE CASES
Streaming Data
Pipelines
Streaming
Applications
Event-driven
Microservices
Real-time
AI / ML
Hybrid / Multicloud
Integration
Mainframe
Integration
Middleware /
Messaging
Modernization
Data Warehouse
Pipelining
CDC Patterns from
Systems of Records
Log
Aggregation
… and more
Shift-left Processing &
Governance
ANTI-CHEAT, CYBERSECURITY, & FRAUD
Cheat detection
Bot detection
Real time automated account bans
DDOS detection and mitigation
Suspicious login detection
Ad fraud detection
USER EXPERIENCE & ENGAGEMENT
Cross platform experiences
User engagement analysis
Live engagement
Social media integration
Next best action/offer
Alerts/Notifications
Personalized promotions and loyalty programs
Event driven marketing
Low-latency user experience
BUSINESS OPERATIONS
Content finance engineering
Budget allocation and marketing attribution
Marketing campaign analytics
Microtransaction analytics
Challenge: Building a cloud-native, 5G, open RAN (ORAN) network
that supports a connectivity platform to enable app
communications from any cloud to any device.
Solution: Adopting Confluent Cloud to support networking and
automation across the hybrid cloud and create consumable data
products for developers to leverage.
Results: DISH Wireless creates data pipelines using streaming data
pipelines built with Confluent Cloud. As a result, its connectivity
platform can unlock enterprise Industry 4.0 use cases across
industries like retail, manufacturing, and automotive.
“We're going up orders of magnitude on the volume of
data that we want to transact over Confluent. By having
a partner like Confluent, we can start generating value
from our network at an unimaginable level today.” —
Brian Mengwasser, Vice President & Head of MarketPlace & Apps Design
Industry: Telecommunications | Geo: Global | Product: Confluent Cloud
Click here to learn more.
Challenge: Decouple a growing number of backend services,
manage large amounts of log data, speed time to innovation
for future features.
Solution: Implemented Apache Kafka® to connect data across
the organization and support the move to microservices.
Results:
●Reduced time to deployment
●Achieved massive scalability
●Freed developer resources
●Eliminated workflow interruption
“We work with Confluent because it is extremely
valuable to have access to experts who know the
most about Apache Kafka. It’s that feeling of security
that has paid off for us.” — Robert Christ, Principal Engineer
Industry: Media & Entertainment | Geo: AMER | Product: Confluent Platform
Challenge: Improving time to insights by moving away from
conventional batch-based data processing as the sole means of gaining
insights and moving toward a stream based approach.
Solution: Using streaming data—made possible by Confluent Cloud—to
power top-quality, personalized and adaptive experiences for users,
inform product roadmap, and drive growth and profitability
Results
●Agile decision-making from use of real-time data to monitor and
quickly pivot on in-product messaging, campaigns, A/B tests, and
experiments
●Adaptive bitrate streaming that ensures seamless playback
experience for viewers—without buffering
●Teams can focus on value-added tasks versus managing
infrastructure or waiting to access data
●Faster time to market for data products (from days to hours)
●Reduced TCO with Confluent Cloud vs. self-managed Kafka,
allowing three FTEs to focus on product development
●Scalability for future use cases, including content delivery
optimization and AI/ML
“With Confluent, the overarching win is that we’re able to
spend more time implementing design and better
workflows, making improvements within our current
systems, and working on product enhancements. We now
have the ability to be driven by business needs.” — Burton Williams, Principal Data Platform Engineer
Industry: Video Technology | Geo: AMER | Product: Confluent Cloud
Click here to learn more.
Coffee Break
Monoliths to Event-Driven
Microservices with Confluent
Samuele Dell’Angelo
Senior Solution Engineer
Agenda
1. Tenets for Faster Application Development
2. Challenges with Monoliths and Interservice Communication
3. How We Do It
4. Confluent Cloud in a nutshell
5. Intro to Stream Processing and Apache Flink
Tenets for Faster Application Development
Eliminate
Dependencies
Reduce
Technical Debt
and TCO
Build and
Integrate
Scalable Apps
What’s Holding You Back?
Challenges with Monoliths …
Monolith
●Slow: Smallest change requires end-to-end testing
●Unreliable: A simple error can bring down the entire app
●Expensive: Changes to tech stack are expensive. Barrier
to adopting new programming frameworks
Services
Database
Microservices
●Faster release cycles: Smaller modules that can be
independently deployed and upgraded
●Improved Scalability and Reliability: Independently
scale services and recover from errors faster
●Developer autonomy: Choose the right programming
language and data store for each service
Services
Databases
Services
Database
… Have Led to the Adoption of Microservices
Two Approaches
For Interservice
Communication
REST APIs
Synchronous request/response communication
Messaging Queues
Synchronous or asynchronous communication
Challenges with Using REST APIs for Interservice
Communication
REST APIs
Synchronous request/response
Challenges
●Slow
Tight coupling means business logic
is rewritten to add new services.
●Unreliable
Requires costly linear scaling.
Unexpected errors are difficult to
recover from.
●Expensive
High operational overhead of
managing services, at scale.
(Diagram: six services coupled point-to-point through synchronous APIs)
Challenges with Using Message Queues for
Interservice Communication
Messaging Queues
Synchronous or asynchronous
communication
Challenges
●Slow
Lack a common structure, leads
to inter-team dependencies.
●Unreliable
Cannot scale dynamically.
Ephemeral persistence.
●Expensive
Cannot easily integrate with
cloud-native apps.
(Diagram: six services connected through messaging queues — no common structure to share data, no built-in stream processing, ephemeral message persistence, low fault-tolerance)
Microservices → Event-Driven Microservices
Why does it make sense? Microservices vs. Event-Driven Microservices:
State: Distributed → Central, synchronized through events
Coupling: Semi-loose with API versioning → Completely decoupled (fire and forget)
E2E Testing: Requires a consistent distributed system → The streaming platform reduces dependencies on external systems
Execution Model: Synchronous → Reactive (asynchronous)
How We Do It:
How to Evolve into Event-Driven Microservices
CDC off legacy
database /
mainframe
Leverage Confluent’s
connector ecosystem
to easily CDC data off
legacy mainframes
and databases
Modernize legacy
MQ to event
streaming
platform
Decouple MQ from
your application by
sending data to
Confluent
Enrich events
with less code
using stream
processing
Join, summarize, and
filter event streams in
real time, reducing
the number of
microservices needed
Ensure data
compatibility with
Schema Validation
Validate data
compatibility and enable
evolution of schemas
while preserving
backwards compatibility
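The "enrich events with less code" step above is essentially a stream-table join plus a filter, applied once, close to the source. On Confluent this would be expressed in ksqlDB or Flink SQL; the sketch below only models the idea in plain Python, and all data and field names are illustrative.

```python
# Reference data (a "table" -- think of a compacted topic materialized
# into a local lookup store)
customers = {"c1": {"segment": "premium"}, "c2": {"segment": "basic"}}

# Incoming event stream (e.g. an orders topic)
orders = [
    {"customer_id": "c1", "amount": 120},
    {"customer_id": "c2", "amount": 15},
    {"customer_id": "c1", "amount": 30},
]

def enrich(stream, table):
    """Stream-table join + filter, done once instead of in every consumer."""
    for event in stream:
        ref = table.get(event["customer_id"], {})
        enriched = {**event, **ref}          # join: add reference context
        if enriched.get("segment") == "premium":  # filter on enriched field
            yield enriched

premium_orders = list(enrich(orders, customers))
print(premium_orders)
```

Every downstream consumer then receives the already-joined, already-filtered stream, which is what lets a team remove the per-service enrichment microservices the slide alludes to.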
Today’s Pattern
Write to My Database
Then Use CDC
(Write Through)
Write to Kafka Then
Sink to My Database
Commit the Horror
That is Dual Writes
Suggested approach: Outbox + CDC Patterns
In DDD systems, an Aggregate’s internal representation of data
may (and often should) differ from its outside contract. In that
scenario, the outbox pattern can be applied.
Application writes data to its table in the database and
also to an outbox table within a single Transaction.
The outbox table is sourced to Kafka topics.
(Write Through)
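A minimal sketch of the transactional outbox, using SQLite as a stand-in for the service's database (table and column names are illustrative): the business row and the outbox row commit in one local transaction, and a relay — in production, a CDC source connector tailing the outbox table — drains it to Kafka topics.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         topic TEXT, payload TEXT);
""")

def place_order(order_id, item):
    # Business write and outbox write share ONE local transaction:
    # either both commit or neither does -- no dual-write gap.
    with db:
        db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, item))
        db.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("orders", json.dumps({"order_id": order_id, "item": item})),
        )

def relay():
    # In production this is a CDC source connector, not application code.
    rows = db.execute("SELECT id, topic, payload FROM outbox ORDER BY id")
    return [(topic, json.loads(p)) for _, topic, p in rows]

place_order(1, "camera")
place_order(2, "tripod")
print(relay())
```

The outbox row's insertion order also gives the relay a natural per-aggregate ordering to preserve when producing to the topic.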
Replace XA transactions for multiple database writes
●One Application needs to write to two different database storage engines where XA transactions are
not supported.
●Instead of XA transaction, the Application writes data to a single database table, which is sourced to
a Kafka topic by a Source connector. Multiple Sink connectors can be used to sink data to different
storage engines.
Suggested approach: Outbox + CDC Patterns
Transactional outbox with Confluent
Outbox
Orders
Confluent Platform Connect
ksqlDB / Kafka Streams
Relay (no code)
Stream
processing
(code)
Applications
Stream
processing
(SQL)
Event
Broker
Connect
Other
Services
Other approach: Database write aside
In its default form, this pattern guarantees dual-write for most use
cases. However, should the database transaction fail at commit time
(say, because the database server has crashed) the write to Kafka
cannot be rolled back unless transactions have been enabled. For
many use cases, this eventuality will be tolerable as the dual-write can
be retried once the failure is fixed, and most event consumers will
implement idempotence anyway. However, application programmers
need to be aware that there is no firm guarantee.
Transactional messaging systems like Kafka can be used to provide
stronger guarantees so long as all event consumers have the
transactions feature enabled.
(Database write aside)
Write to a database, then write to Kafka. Perform the write to Kafka as the last step in a database transaction to ensure
an atomic dual commit (aborting the transaction if the write to Kafka fails).
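Because a failed write-aside may be retried, consumers can see the same event delivered twice, and the text's mitigation is consumer-side idempotence. A sketch of the usual deduplication, keyed on a stable event id (class and field names are illustrative):

```python
class IdempotentConsumer:
    """Apply each event at most once, keyed on a stable event id."""

    def __init__(self):
        self.seen = set()   # in production: a keyed store with retention
        self.balance = 0

    def handle(self, event):
        if event["event_id"] in self.seen:
            return  # duplicate delivery from a retried dual-write: skip
        self.seen.add(event["event_id"])
        self.balance += event["amount"]

consumer = IdempotentConsumer()
deposit = {"event_id": "tx-42", "amount": 250}
consumer.handle(deposit)
consumer.handle(deposit)  # redelivery after a producer retry
print(consumer.balance)
```

The key design choice is that the producer assigns the event id once, so every redelivery of the same fact is recognizable downstream.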
Temptation: Dual write
●Database transactions and Kafka transactions are
separate
●To perform them atomically would need to be done
as a distributed transaction
●You can use Spring Transaction Manager
○Chaining Kafka and Database Transactions
with Spring Boot: An In-Depth Look
○Examples of Kafka Transactions with Other
Transaction Managers
(Dual write)
Write to a database and to Kafka using an atomic approach.
(Diagram: the application writes separately to Kafka and to the database)
Are you safe with this approach?
Kafka does not currently support two-phase commit (2PC)
Next Step
KIP-939:
Support
Participation
in Two Phase
Commit
Enables consistent and
atomic update of Kafka and
other transaction stores
3. On the Horizon
(Diagram: producer apps reach a PREPARED state, after which the outcome is a consistent COMMIT or ABORT on both sides)
Solving 2PC with KIP-939
Red Flags
InitProducerId must be called on
client restarts; it currently aborts
any ongoing transactions
transaction.max.timeout.ms can
abort transactions independently
of the transaction coordinator
Example Flow
(Diagram: the app coordinating one transaction across Kafka and the database; steps numbered below)
1. Commit or Abort Previous Transaction in Kafka (if any)
2. Start New Transactions in Kafka and the Database; Write App Data
3. 2PC Prepare Phase: Prepare Kafka Transactions
4. Persist 2PC Decision
5. 2PC Commit Phase: Commit Kafka Transactions
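The steps above can be sketched as a coordinator that prepares both participants, durably records the decision, and then completes both sides. The classes below only simulate the protocol shape — KIP-939's actual client API and failure handling differ, and `FakeTxnStore` is entirely illustrative.

```python
class FakeTxnStore:
    """Stand-in for a 2PC participant (Kafka or a database)."""

    def __init__(self):
        self.pending, self.committed = [], []

    def begin_write(self, data):
        self.pending.append(data)       # step 2: write app data in a txn

    def prepare(self):
        return bool(self.pending)       # step 3: vote in the prepare phase

    def commit(self):
        self.committed += self.pending  # step 5: make the writes durable
        self.pending = []

    def abort(self):
        self.pending = []               # step 1 on restart, if undecided

def two_phase_commit(kafka, database, record):
    kafka.begin_write(record)                       # step 2
    database.begin_write(record)
    if kafka.prepare() and database.prepare():      # step 3
        decision = "commit"                         # step 4: persist decision
    else:
        decision = "abort"
    for store in (kafka, database):                 # step 5 (or rollback)
        store.commit() if decision == "commit" else store.abort()
    return decision

kafka, database = FakeTxnStore(), FakeTxnStore()
decision = two_phase_commit(kafka, database, {"order": 1})
print(decision)
```

The durable step-4 decision is what lets a restarted coordinator resolve any transaction left in the PREPARED state instead of aborting it.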
FLIP-319 – Integrating with Kafka 2PC Transactions
Hardens EOS
Across Flink and Kafka
Simplifies Upgrading
Kafka Clients Used by Flink
Step 1: Partially Replace Catalog Module
Catalog
Order
Payment
On-Prem Monolith
(E-commerce App)
Load
Balancer
Client/Web
Application
Writes to
Catalog module
Other Writes
and Reads
Writes
Reads
Writes
CDC Source
Connector
Streams (Reads)
Reads from
Microservice
Partially replaces
Catalog Module
Cloud Native
Catalog Microservice
Step 2: Fully Replace Catalog Module
Order
Payment
On-Prem Monolith
(E-commerce App)
Load
Balancer
Client/Web
Application
Other Writes
and Reads
Writes
Reads
Writes
CDC Source +Sink
Connector
Cloud Native
Catalog Microservice
Streams (Reads)
Reads from
Microservice
Reads
Streams (Writes)
Writes to
Microservice
Third Party Consumer Apps
Challenge: eBay Korea wanted to enable continued growth
across multiple e-commerce platforms that had been in place
for more than 15 years.
Solution: eBay Korea leverages Confluent to help migrate from a
monolithic architecture to a microservices architecture that
enables more independence, greater flexibility, and improved
scalability.
Results:
●Simplified maintenance and operations for improved
productivity
●Stable, high-throughput streaming
●Rapid support
●Faster time to market
“Although we had some experience running open source Kafka
ourselves, we recognized that we needed a trusted partner to
help us resolve issues that we’d encounter with a larger-scale
deployment. Confluent not only had a local office to provide
that support, they also had the best known, verified solution,
which was what we wanted for our mission-critical
e-commerce platforms.”
— Hudson Lee, Leader of Platform Engineering
Confluent Cloud in a Nutshell
The Confluent Product Advantage
Re-imagined Kafka
experience for the Cloud
Be everywhere our
customers want to be
COMPLETE EVERYWHERE CLOUD NATIVE
Enable developers to
reliably & securely build
next-gen apps faster
Cloud-native data streaming platform built by the founders of Apache Kafka®
KORA: THE KAFKA ENGINE FOR THE CLOUD STREAM
Everywhere
Be everywhere our
customers want to be
Cloud-native
Re-imagined Kafka
experience for the Cloud
Complete
Go above and beyond
Kafka with a complete
data streaming platform
CONNECT GOVERN PROCESS
Confluent Data Streaming Platform
Kora Architecture
We spent 5 million+ hours to re-invent Kafka into a truly cloud-native engine
[Architecture diagram: customers connect through a multi-cloud networking and routing tier to cells spread across availability zones (AZs), built on decoupled network, compute, and object storage layers. A global control plane handles metadata, durability audits, data balancing, and health checks, driven by real-time feedback data; security spans every layer.]
THE KAFKA ENGINE FOR THE CLOUD
[Headline claims from the slide: fast, elastic, resilient, cost effective, with 10X, 16X, and 30X multipliers]
Using fully managed connectors is the fastest, most efficient way to break data silos
Accelerated time-to-value • Increased developer productivity • Reduced operational burden

Custom-built connector:
● Costly to allocate resources to design, build, test, and maintain non-differentiated data integration components
● Delays time-to-value, taking up to 3-6+ engineering months to develop
● Perpetual management and maintenance increases tech debt and risk of downtime

Self-managed connector:
● Pre-built, but requires manual installation and configuration effort to set up and deploy connectors
● Perpetual management and maintenance of connectors that leads to ongoing tech debt
● Risk of downtime and business disruption due to connector and Connect cluster related issues

Fully managed connector:
● Streamlined configuration and on-demand provisioning of your connectors
● Eliminates operational overhead and management complexity with seamless scaling and load balancing
● Reduced risk of downtime with Confluent Cloud’s 99.95% SLA for all your mission-critical use cases
Connect your entire business with just a few clicks
70+ fully managed connectors, including: Amazon S3, Amazon Redshift, Amazon DynamoDB, Google Cloud Spanner, AWS Lambda, Amazon SQS, Amazon Kinesis, Azure Service Bus, Azure Event Hubs, Azure Synapse Analytics, Azure Blob Storage, Azure Functions, Azure Data Lake, Google BigTable
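As a sketch of how little configuration a fully managed connector needs, here is a Confluent Cloud style S3 sink definition. The property names follow the fully managed Amazon S3 Sink connector but should be treated as assumptions to verify against the current docs; all credential and bucket values are placeholders:

```json
{
  "name": "S3SinkOrders",
  "config": {
    "connector.class": "S3_SINK",
    "kafka.api.key": "<API_KEY>",
    "kafka.api.secret": "<API_SECRET>",
    "topics": "orders",
    "aws.access.key.id": "<AWS_KEY>",
    "aws.secret.access.key": "<AWS_SECRET>",
    "s3.bucket.name": "my-orders-bucket",
    "input.data.format": "JSON",
    "output.data.format": "JSON",
    "time.interval": "HOURLY",
    "tasks.max": "1"
  }
}
```

Compared with self-managing a Connect cluster, this is the entire operational surface: provisioning, scaling, and failure handling stay on the Confluent Cloud side.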
Stream Governance
Stream Governance is provided fully managed on Confluent Cloud
Stream Quality (Schema Registry & Validation): Deliver trusted, high-quality data streams to the business
Stream Catalog: Increase collaboration and productivity with self-service data discovery
Stream Lineage: Understand complex data relationships and uncover more insights
Discover, understand, and trust your data streams with the industry’s only fully managed governance suite for Apache Kafka®
Stream Quality: Deliver trusted, high-quality data streams to the business
[Diagram: producer apps validate schemas against Schema Registry before producing, and consumer apps validate schemas before consuming: a universal language and quality control for data in motion.]
Avro, Protocol Buffers, and JSON serialization formats are supported.
Schema Registry is available with a 99.95% uptime SLA.
Maintain trusted, compatible data streams across cloud and hybrid environments with shared schemas that sync in real time, compatibility rules, and schema references.
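Compatibility rules are easiest to see with a concrete schema evolution. As a sketch (the record and field names are illustrative, not from the deck): suppose version 1 of an Avro schema had only `id` and `amount`. Version 2 below adds a `currency` field with a default, which is a backward-compatible change, so consumers on the new schema can still read records written with the old one, and Schema Registry would accept it under backward compatibility checking:

```json
{
  "type": "record",
  "name": "Order",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "amount", "type": "double"},
    {"name": "currency", "type": "string", "default": "EUR"}
  ]
}
```

Removing a field without a default, or changing a field's type, would instead be rejected by the registry before any producer could publish with the incompatible schema.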
Intro to Stream Processing and Apache Flink
Real-time processing: Power low-latency applications and pipelines that react to real-time events and provide timely insights
Data reusability: Share consistent and reusable data streams widely with downstream applications and systems
Data enrichment: Curate, filter, and augment data on-the-fly with additional context to improve completeness, accuracy, and compliance
Efficiency: Improve resource utilization and cost-effectiveness by avoiding redundant processing across silos
“With Confluent’s fully managed Flink offering, we can access, aggregate, and enrich data from IoT sensors,
smart cameras, and Wi-Fi analytics, to swiftly take action on potential threats in real time, such as intrusion
detection. This enables us to process sensor data as soon as the events occur, allowing for faster detection and
response to security incidents without any added operational burden.”
Effortlessly filter, join, and enrich your data streams with Apache Flink
[Diagram: streams from blob storage, third-party apps, databases, data warehouses, and SaaS apps are processed in-flight, producing low-latency apps and data pipelines, consistent and reusable data products, and optimized resource utilization.]
Process data streams in-flight to maximize actionability, fidelity, and portability
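The filter-join-enrich pattern named on the slide can be sketched in Flink SQL. The `orders` and `customers` tables, their columns, and the threshold are hypothetical placeholders, not part of the deck:

```sql
-- Filter high-value orders in-flight and enrich them with customer attributes.
-- Table and column names are illustrative placeholders.
INSERT INTO enriched_orders
SELECT o.order_id,
       o.amount,
       c.customer_name,
       c.region
FROM orders AS o
JOIN customers AS c
  ON o.customer_id = c.customer_id
WHERE o.amount > 100;
```

The same statement runs continuously over the unbounded streams backing both tables, so every qualifying order is enriched as it arrives rather than in a later batch job.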
STREAMING DATA PIPELINES
Build streaming data pipelines to inform real-time decision making
INSERT INTO enriched_reviews
SELECT id,
       review,
       invoke_openai(prompt, review) AS score
FROM product_reviews;
[Figure: product reviews before and after enrichment. Input: Kate, 4 hours ago, “This was the worst decision ever.”; Nikola, 1 day ago, “Not bad. Could have been cheaper.”; Brian, 3 days ago, “Amazing! Game Changer!”. Output: the same reviews annotated with star scores produced by the model.]
The Prompt: “Score the following text on a scale of 1 to 5, where 1 is negative and 5 is positive, returning only the number”
DATA STREAMING PLATFORM
Enrich real-time data streams with Generative AI directly from Flink SQL