Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Across Event Streams
ScyllaDB
15 slides
May 15, 2024
About This Presentation
We start by establishing common ground: why relational databases fall short, and which characteristics are common to EDA, such as the need for real-time response times and schemaless approaches that adapt to recurring changes and onboard new use cases. Next, we interact with a sample Rust-based application: a social network app demonstrating an integration of both ScyllaDB and Redpanda.
Slide Content
Integrating Distributed Data Stores Across Event Streams Felipe Cardeneti Mendes
Felipe Cardeneti Mendes
  - Solution Architect at ScyllaDB
  - Published Author
  - Linux and Open Source enthusiast
Complexity Exists on Both Sides… But for Different Reasons
  - Highly encouraged: Martin Kleppmann, Event Sourcing and Stream Processing at Scale
PageView Event Example
  - Track whenever a page gets viewed… So what?
    - Who viewed your profile?
    - People also viewed …
    - Reporting
    - Relevance models (i.e., result ranking)
    - Metrics
  - From Martin Kleppmann, Event Sourcing and Stream Processing at Scale
Ok… so why do we need ScyllaDB then?
  - This is where we (slightly) diverge from Martin:
  - From Martin Kleppmann, Event Sourcing and Stream Processing at Scale
  - Super slow, but it doesn't have to be!
So you're telling me to ditch Stream Processing?
  - Well… No. Although you could.
  - Data and domain dependent
    - Full-text searches
    - Ad-hoc querying
    - Joins
  - Stateless vs. stateful processing
    - Keep doing stateless (or semi-stateless) as you know
    - Re-think your stateful processing strategy
  - We are (almost) a decade past 2016
Tracking Events
  - Each event is uniquely identified
    - Great for even distribution and performance
    - Auditing, history of interactions, power batch analytics
    - Bad for aggregations (including sliding windows): a blocker for real-time analytics
  - Potential workaround: map event to entity+type, cluster by timestamp
  - Actual data model:
      CREATE TABLE IF NOT EXISTS ks.events (
          id uuid,
          ts timestamp,
          event_type text,
          src_page text,
          PRIMARY KEY (id, event_type, ts)
      )
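The "map event to entity+type, cluster by timestamp" workaround can be illustrated with a small in-memory sketch. This is not the sample application's code: `EventsByEntity`, `PartitionKey`, and the `u64` ids (standing in for `uuid`) are hypothetical names chosen for illustration. The point is only that once rows are grouped by entity and type and kept sorted by timestamp, a sliding-window count becomes a range scan within one partition instead of a scan over uniquely keyed events.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an "entity + type" partition key.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct PartitionKey {
    pub entity_id: u64, // assumption: u64 instead of a uuid, to stay std-only
    pub event_type: String,
}

/// In-memory model of a table partitioned by (entity, type) and
/// clustered (sorted) by timestamp within each partition.
#[derive(Default)]
pub struct EventsByEntity {
    // timestamp in ms -> source pages (a Vec, since two events may share a ms)
    partitions: BTreeMap<PartitionKey, BTreeMap<i64, Vec<String>>>,
}

impl EventsByEntity {
    /// Append one event to its (entity, type) partition.
    pub fn record(&mut self, entity_id: u64, event_type: &str, ts_ms: i64, src_page: &str) {
        let key = PartitionKey { entity_id, event_type: event_type.to_string() };
        self.partitions
            .entry(key)
            .or_default()
            .entry(ts_ms)
            .or_default()
            .push(src_page.to_string());
    }

    /// Count events in the half-open window [from_ms, to_ms).
    /// Only one partition is touched, and only the rows inside the window.
    pub fn count_in_window(&self, entity_id: u64, event_type: &str, from_ms: i64, to_ms: i64) -> usize {
        if from_ms >= to_ms {
            return 0; // empty or inverted window
        }
        let key = PartitionKey { entity_id, event_type: event_type.to_string() };
        self.partitions
            .get(&key)
            .map(|rows| rows.range(from_ms..to_ms).map(|(_, pages)| pages.len()).sum::<usize>())
            .unwrap_or(0)
    }
}
```

The trade-off is the one the slide hints at: grouping by entity gives cheap windowed reads, at the cost of the perfectly even distribution that unique event ids provide.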
Counter Tables (Post Likes, Profile Views, Page Hits)
  - Well… Used for counting things :-)
  - Goods:
    - Simple
    - Highly performant (no aggregations needed, hooray!)
  - Problem: misses context
    - Who viewed a profile?
    - Which pages were popular for users within a given region?
    - Was a post liked by similar users?
  - Hence the importance of defining your events table upfront
  - Actual data model:
      CREATE TABLE IF NOT EXISTS ks.post_likes (
          post_id uuid PRIMARY KEY,
          count counter
      )
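To make the "misses context" point concrete, here is a minimal in-memory sketch that keeps a counter next to an event log. The names `Likes` and `LikeEvent` and the `u64` ids are assumptions for illustration, not taken from the sample application. The counter answers "how many?" in a single lookup, while only the events can answer "who?".

```rust
use std::collections::HashMap;

/// Hypothetical like event carrying the context a bare counter drops.
#[derive(Debug, Clone, PartialEq)]
pub struct LikeEvent {
    pub post_id: u64, // assumption: u64 instead of a uuid
    pub user_id: u64,
    pub ts_ms: i64,
}

/// A counter (analogous to ks.post_likes) kept next to an events log.
#[derive(Default)]
pub struct Likes {
    counts: HashMap<u64, i64>,
    events: Vec<LikeEvent>,
}

impl Likes {
    /// Record a like: bump the counter and keep the full event.
    pub fn like(&mut self, post_id: u64, user_id: u64, ts_ms: i64) {
        *self.counts.entry(post_id).or_insert(0) += 1;
        self.events.push(LikeEvent { post_id, user_id, ts_ms });
    }

    /// "How many likes?" is a single lookup, no aggregation needed.
    pub fn count(&self, post_id: u64) -> i64 {
        self.counts.get(&post_id).copied().unwrap_or(0)
    }

    /// "Who liked it?" cannot be answered from the counter alone.
    pub fn who_liked(&self, post_id: u64) -> Vec<u64> {
        self.events
            .iter()
            .filter(|e| e.post_id == post_id)
            .map(|e| e.user_id)
            .collect()
    }
}
```

If only the counter had been stored, `who_liked` would have nothing to read from, which is why the events table needs to be defined upfront.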
Relationships (Follow and Followers)
  - Easily done (and consistent): if Y is followed by X, then X follows Y
  - Simply materialize keys in the opposite direction
  - Though admittedly it takes a while to get the gist of it
  - Actual (base) data model:
      CREATE TABLE IF NOT EXISTS ks.follows (
          id uuid,
          follower uuid,
          ts timestamp,
          PRIMARY KEY (id, follower)
      )
  - Swap the two key columns (id, follower) for the opposite direction
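A rough sketch of "materialize keys in the opposite direction", again in memory and under stated assumptions: `FollowGraph` and its method names are hypothetical, ids are `u64` rather than `uuid`, and the deck does not say whether the reverse direction is a second table or a view. Each follow is written under both key orders so that either question is a direct lookup.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Both directions of the follow relationship, each keyed for direct lookup.
#[derive(Default)]
pub struct FollowGraph {
    // keyed like the base model: (id, follower)
    followers_by_id: BTreeMap<u64, BTreeSet<u64>>,
    // the same pair with the keys swapped: (follower, id)
    following_by_follower: BTreeMap<u64, BTreeSet<u64>>,
}

impl FollowGraph {
    /// Record that `follower` follows `id`, writing both key orders.
    pub fn follow(&mut self, follower: u64, id: u64) {
        self.followers_by_id.entry(id).or_default().insert(follower);
        self.following_by_follower.entry(follower).or_default().insert(id);
    }

    /// Who follows `id`? Reads the base direction.
    pub fn followers_of(&self, id: u64) -> Vec<u64> {
        self.followers_by_id
            .get(&id)
            .map(|s| s.iter().copied().collect())
            .unwrap_or_default()
    }

    /// Whom does `follower` follow? Reads the swapped direction.
    pub fn followed_by(&self, follower: u64) -> Vec<u64> {
        self.following_by_follower
            .get(&follower)
            .map(|s| s.iter().copied().collect())
            .unwrap_or_default()
    }
}
```

In this sketch both writes happen in one method call; in a real distributed store, keeping the two directions in step is exactly where the consistency patterns discussed next come in.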
Improvement Thoughts
  - Consistency guarantees: apply common patterns such as:
    - Outbox: for non-idempotent operations (DON'T overuse it)
    - Listen to Yourself
  - Publish database events (CDC):
    - Strongly consistent
    - Push notifications!
    - Feed other downstream apps/services (i.e., support full-text, ad-hoc, etc.)
  - WebSockets for real-time communication / async callbacks?
  - You name it! ;-)
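As a rough illustration of the Outbox pattern named above, the following in-memory sketch records a state change and its outgoing event together, then lets a relay publish pending entries and remove each one only after a successful publish. Everything here is hypothetical (`Store`, `OutboxEntry`, `Consumer`, the payload format), and a single method call stands in for whatever atomic write a real data store would provide. Because a relay like this can deliver an entry more than once, the consumer deduplicates by event id.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical outbox row: an event waiting to be published.
#[derive(Debug, Clone, PartialEq)]
pub struct OutboxEntry {
    pub event_id: u64,
    pub payload: String,
}

/// In-memory stand-in for a store holding business data and an outbox.
#[derive(Default)]
pub struct Store {
    posts: HashMap<u64, String>,
    outbox: VecDeque<OutboxEntry>,
    next_event_id: u64,
}

impl Store {
    /// Write the post and its outbox entry in one step.
    pub fn create_post(&mut self, post_id: u64, body: &str) {
        self.posts.insert(post_id, body.to_string());
        self.next_event_id += 1;
        self.outbox.push_back(OutboxEntry {
            event_id: self.next_event_id,
            payload: format!("post_created:{}", post_id),
        });
    }

    /// Read back a stored post body, if any.
    pub fn post(&self, post_id: u64) -> Option<&str> {
        self.posts.get(&post_id).map(|s| s.as_str())
    }

    /// Publish pending entries in order; stop at the first failure.
    /// An entry is removed only after `publish` returns true.
    pub fn relay<F: FnMut(&OutboxEntry) -> bool>(&mut self, mut publish: F) -> usize {
        let mut sent = 0;
        while let Some(entry) = self.outbox.front() {
            if !publish(entry) {
                break;
            }
            self.outbox.pop_front();
            sent += 1;
        }
        sent
    }

    /// Number of entries still waiting to be published.
    pub fn pending(&self) -> usize {
        self.outbox.len()
    }
}

/// Consumer that ignores event ids it has already applied.
#[derive(Default)]
pub struct Consumer {
    seen: HashSet<u64>,
    pub applied: Vec<String>,
}

impl Consumer {
    /// Returns true if the event was applied, false if it was a duplicate.
    pub fn handle(&mut self, entry: &OutboxEntry) -> bool {
        if self.seen.insert(entry.event_id) {
            self.applied.push(entry.payload.clone());
            true
        } else {
            false
        }
    }
}
```

A failed publish leaves the entry in the outbox for the next relay pass, so the event is eventually delivered rather than lost. The extra table and relay are the cost, which is one reading of the slide's advice not to overuse the pattern.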
Extreme (and Inspiring) User Stories
  - How Numberly Replaced Kafka with Rust + ScyllaDB
  - How Palo Alto Networks Replaced Kafka with ScyllaDB
  - From 1M to 1B Features Per Second: Scaling ShareChat’s ML Feature Store
  - Elasticity vs. State? Exploring Kafka Streams Cassandra State Store
Keep in touch!
  - Felipe Cardeneti Mendes
  - Solution Architect, ScyllaDB
  - [email protected]
  - Find me on LinkedIn!