ScyllaDB’s Change Data Capture (CDC) lets you stream both the current state and a history of all changes made to your ScyllaDB tables. In this talk, Senior Solutions Architect Guilherme Nogueira discusses how CDC can be used to build real-time event processing systems, and explores a wide range of integrations and the distinct record types (Deltas, Pre-Images and Post-Images) to help you get started.
Size: 1.99 MB
Language: en
Added: Jun 20, 2024
Slides: 17 pages
Slide Content
Real-Time Event Processing with CDC Guilherme Nogueira, Senior Solutions Architect
Guilherme Nogueira
- Senior Solutions Architect @ ScyllaDB
- Previously at IBM
- Loves all things open source
- Just got a new puppy!
Presentation Agenda
- What is CDC?
- Use cases
- Comparison
- Record types
- Consuming CDC data
What is CDC?
Change Data Capture (CDC)
- Consumable modification record for one or more tables
- Key feature of ScyllaDB (GA'd in 4.3/2021.1)
- Constantly receiving improvements
- Captures changes (writes, deletes, updates)
- Each change can trigger an event
- Asynchronously readable by a consumer
- Unified for both CQL and DynamoDB Streams
Use Cases
Where and How Is CDC Used
- Database replication (Elasticsearch)
- Notification systems
- External cache invalidation
- In-flight analytics, such as fraud detection
- Downstream application triggers
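As an illustration of the external cache invalidation use case, here is a minimal sketch in Python. It is not taken from the talk: `ChangeEvent`, `ExternalCache`, and `apply_changes` are hypothetical names, and the event shape is a simplified stand-in for what a real CDC consumer would produce.

```python
from collections import namedtuple

# Hypothetical, simplified view of one captured change (not the real CDC log schema).
ChangeEvent = namedtuple("ChangeEvent", ["table", "primary_key", "operation"])


class ExternalCache:
    """In-memory stand-in for an external cache such as a key-value store."""

    def __init__(self):
        self._entries = {}

    def put(self, table, primary_key, value):
        self._entries[(table, primary_key)] = value

    def get(self, table, primary_key):
        return self._entries.get((table, primary_key))

    def invalidate(self, table, primary_key):
        # Returns True only when a cached entry was actually dropped.
        key = (table, primary_key)
        if key in self._entries:
            del self._entries[key]
            return True
        return False


def apply_changes(cache, events):
    """Drop the cached copy of every row touched by a change event."""
    dropped = 0
    for event in events:
        if cache.invalidate(event.table, event.primary_key):
            dropped += 1
    return dropped
```

Any write, update, or delete on a row makes its cached copy stale, so the sketch treats all operations the same way and simply evicts the entry.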
Comparison
How Does ScyllaDB CDC Compare Against ...
Feature                | Cassandra     | DynamoDB            | MongoDB             | ScyllaDB
Consumer location      | on-node       | off-node            | off-node            | off-node
Replication            | duplicated    | deduplicated        | deduplicated        | deduplicated
Deltas                 | yes           | limited             | partial             | optional
Pre-image              | no            | yes                 | no                  | optional
Post-image             | no            | yes                 | yes                 | optional
Slow consumer reaction | Table stopped | Consumer loses data | Consumer loses data | Consumer loses data
Ordering               | no            | yes                 | yes                 | yes
Record Types
What Do I Get Out of CDC?
Delta: What was changed?
- 'full': contains information about every modified column
- 'keys': only the primary key of the change is recorded
Preimage: What was before?
- 'false': disables the feature
- 'true': contains only the columns that were changed by the write
- 'full': contains the entire row (as it was before the write was made)
Postimage: What's the end result?
- 'false': disables the feature
- 'true': shows the affected row's state after the write. The postimage row always contains all the columns, whether or not they were affected by the change
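The three record types are chosen per table through the `cdc` table option. As a sketch of how the values above fit together, the following Python helper builds an `ALTER TABLE` statement; the helper name `cdc_alter_statement` and the keyspace and table names in the usage note are hypothetical, and the statement would still need to be run through `cqlsh` or a driver.

```python
def _cql_literal(value):
    """Render a Python value as it appears inside the cdc option map."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return "'" + str(value) + "'"


def cdc_alter_statement(keyspace, table, delta="full", preimage=False, postimage=False):
    """Build a CQL statement that enables CDC with the chosen record types."""
    if delta not in ("full", "keys"):
        raise ValueError("delta must be 'full' or 'keys'")
    if not (isinstance(preimage, bool) or preimage == "full"):
        raise ValueError("preimage must be False, True or 'full'")
    if not isinstance(postimage, bool):
        raise ValueError("postimage must be False or True")

    options = {
        "enabled": True,
        "delta": delta,
        "preimage": preimage,
        "postimage": postimage,
    }
    body = ", ".join(
        "'" + name + "': " + _cql_literal(value) for name, value in options.items()
    )
    return "ALTER TABLE " + keyspace + "." + table + " WITH cdc = {" + body + "}"
```

For example, `cdc_alter_statement("ks", "orders", delta="keys", preimage="full", postimage=True)` asks for key-only deltas plus full before and after images of each changed row.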
Consuming CDC Data
How to Consume CDC Data
- CDC data is available through normal CQL
  - Easy to read raw streams
  - Already de-duplicated
- All delta and pre-image values are normal CQL data
  - Can be consumed without knowledge of server internals
- Layered approach
  - CDC core functionality is relatively simple
  - Allows for more sophisticated adaptors (push models, etc.)
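Because the CDC log is a plain table, a consumer can be little more than a `SELECT` followed by some grouping. The sketch below assumes the log table naming (`<table>_scylla_cdc_log`), the `cdc$` metadata columns, and the operation codes described in the ScyllaDB CDC documentation; the rows are passed in as dictionaries so the example runs without a cluster, and `group_changes` is a hypothetical helper name. In a real consumer the rows would come from a driver query, and `cdc$time` would be a timeuuid rather than the integers used as stand-ins in the usage note.

```python
# Operation codes assumed from the ScyllaDB CDC log table documentation.
OPERATION_NAMES = {
    0: "preimage",
    1: "update",
    2: "insert",
    3: "row_delete",
    4: "partition_delete",
    9: "postimage",
}


def log_table_name(base_table):
    """Name of the CDC log table that accompanies a CDC-enabled base table."""
    return base_table + "_scylla_cdc_log"


def group_changes(log_rows):
    """Group log rows that belong to the same write.

    Rows sharing a stream id and timestamp describe one change; the batch
    sequence number orders the pre-image, delta, and post-image rows in it.
    """
    ordered = sorted(
        log_rows,
        key=lambda r: (r["cdc$stream_id"], r["cdc$time"], r["cdc$batch_seq_no"]),
    )
    changes = []
    for row in ordered:
        key = (row["cdc$stream_id"], row["cdc$time"])
        kind = OPERATION_NAMES.get(row["cdc$operation"], "other")
        if changes and changes[-1]["key"] == key:
            changes[-1]["kinds"].append(kind)
        else:
            changes.append({"key": key, "kinds": [kind]})
    return changes
```

A push-style adaptor, as mentioned in the layered approach, could be built on top of this by calling a handler for each grouped change.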
Why CDC on ScyllaDB?
- Easy to integrate and consume
  - Plain tables
- Robust
  - Replicated in the same way as normal data
  - Benefits from all read path improvements
- Reasonable overhead
  - Coalesced writes to same replica ranges
  - Overhead is comparable to adding another table
- Does not overflow if the consumer fails to act
  - Data is TTL'd
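The TTL is what keeps the log from overflowing, and it is also why a slow consumer loses data rather than stalling writes. The following sketch makes that trade-off concrete; it assumes a 24-hour TTL (86400 seconds, which I believe is the ScyllaDB default for the `ttl` CDC option), and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed default CDC log TTL; it can be changed per table.
DEFAULT_CDC_TTL = timedelta(seconds=86400)


def still_readable(written_at, now, ttl=DEFAULT_CDC_TTL):
    """True while a log record written at `written_at` has not yet expired."""
    return now - written_at < ttl


def lost_records(write_times, consumer_resumes_at, ttl=DEFAULT_CDC_TTL):
    """Write times whose log records expired before the consumer came back."""
    return [
        written_at
        for written_at in write_times
        if not still_readable(written_at, consumer_resumes_at, ttl)
    ]
```

A consumer that may be offline for longer than the TTL needs either a larger `ttl` value or a way to rebuild its state from the base table.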
Stay in Touch Guilherme Nogueira [email protected] hopugop https://www.linkedin.com/in/guilherme-nogueira-4740a116/