Using ScyllaDB for Real-Time Write-Heavy Workloads

ScyllaDB 397 views 29 slides Jul 25, 2024

About This Presentation

Keeping latencies low for highly concurrent, intensive data ingestion

ScyllaDB’s “sweet spot” is workloads over 50K operations per second that require predictably low (e.g., single-digit millisecond) latency. And its unique architecture makes it particularly valuable for the real-time write-h...


Slide Content

Using ScyllaDB for
Real-Time Write-Heavy
Workloads
Felipe Cardeneti Mendes, Technical Director, ScyllaDB
Lubos Kosco, Principal Field Engineer, ScyllaDB

+For data-intensive applications that require high throughput and predictable low latencies
+Close-to-the-metal design takes full advantage of modern infrastructure
+>5x higher throughput
+>20x lower latency
+>75% TCO savings
+Compatible with Apache Cassandra and Amazon DynamoDB
+DBaaS/Cloud, Enterprise and Open Source solutions

The Database for Gamechangers
“ScyllaDB stands apart...It’s the rare product that exceeds my expectations.”
– Martin Heller, InfoWorld contributing editor and reviewer
“For 99.9% of applications, ScyllaDB delivers all the power a customer will ever need, on workloads that other databases can’t touch – and at a fraction of the cost of an in-memory solution.”
– Adrian Bridgewater, Forbes senior contributor

+400 Gamechangers Leverage ScyllaDB
Seamless experiences across content + devices
Digital experiences at massive scale
Corporate fleet management
Real-time analytics
2,000,000 SKU e-commerce management
Video recommendation management
Threat intelligence service using JanusGraph
Real time fraud detection across 6M transactions/day
Uber scale, mission critical chat & messaging app
Network security threat detection
Power ~50M X1 DVRs with billions of reqs/day
Precision healthcare via Edison AI
Inventory hub for retail operations
Property listings and updates
Unified ML feature store across the business
Cryptocurrency exchange app
Geography-based recommendations
Global operations - Avon, Body Shop + more
Predictable performance for on sale surges
GPS-based exercise tracking
Serving dynamic live streams at scale
Powering India's top social media platform
Personalized advertising to players
Distribution of game assets in Unreal Engine

Presenters

Felipe Cardeneti Mendes, Technical Director
+Puppy Lover
+Open Source Enthusiast
+ScyllaDB passionate!

Lubos Kosco, Principal Field Engineer
+Software Engineer
+?????? lover, ?????? pilot
+ScyllaDB enthusiast :-)

Agenda
+Characterizing Write-heavy workloads
+Challenges and Tradeoffs
+ScyllaDB Under Load
+Best Practices
+Success Stories

Characterizing Write-Heavy Workloads
High throughput REAL-TIME data ingestion.

Real-time Write-Heavy?
+Commonly referred to as "write-mostly"
+Workloads requiring a high volume of writes under very low response times
+Challenges involve:
  +Scaling writes – Price per operation, often driven by internal design decisions
  +Locking – Adds delays and reduces throughput
  +I/O Bottlenecks – Write amplification & crash recovery
  +Conflict Resolution – Resolving conflicts and/or commit protocols
  +Database Backpressure – Throttling incoming load

Commonly Seen Use Cases
+Internet of Things
  +Often time-series workloads
  +Small (but frequent!) append-only writes
  +Rate determined by number of ingestion endpoints
+Logging and Monitoring
  +Similar to IoT, but…
  +Doesn't have a fixed ingestion rate
  +Not necessarily append-only, and prone to hotspots
+Online Gaming
  +Real-time user interactions (state, actions, messaging)
  +Often spiky
  +Very latency dependent
+E-commerce & Retail
  +Update-heavy and batch-y
  +Inventory updates, reviews, order status & placement
  +Shopping carts inherently require a read-before-write
+Ad Tech and Real-time Bidding
  +Bid Processing (Impressions, Auction outcomes)
  +User Interaction (Clicks, Conversions, Fraud Detection)
  +Audience Segmentation
+Real-time Stock Exchange
  +High-frequency trading (HFT)
  +Stock price updates
  +Order matching

Challenges and Tradeoffs
What happens during a write?

Underlying Storage Engine
+LSM Tree
+B+Tree

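ScyllaDB – Write Path
[Diagram: incoming writes are recorded in the commit log and the memtable; over time, memtables are flushed to storage as SSTables 1 through 5; compaction merges SSTables 1+2+3 into a single SSTable; reads are served from the memtable and the SSTables.]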
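
The write path above can be sketched as a toy model. This is a minimal illustration of the general LSM idea (commit log, memtable, flush to immutable SSTables, compaction), not ScyllaDB's actual implementation. The class name `ToyLSM` and the `memtable_limit` parameter are hypothetical, and real engines flush by memory size rather than by key count.

```python
class ToyLSM:
    """Toy LSM write path for illustration only (hypothetical, not ScyllaDB code)."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only log, replayed after a crash
        self.memtable = {}     # in-memory buffer holding the newest writes
        self.sstables = []     # immutable flushed tables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        # A write appends to the commit log and updates the memtable;
        # no existing data on disk is read or modified.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Persist the memtable as a new sorted, immutable SSTable.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs replay

    def read(self, key):
        # Check the memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            if key in table:
                return table[key]
        return None

    def compact(self):
        # Merge all SSTables into one; newer values overwrite older ones.
        merged = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [dict(sorted(merged.items()))]
```

In this sketch a read may have to visit several SSTables until compaction merges them, which is the read-versus-write trade-off that compaction strategies try to balance.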

Payload Size

Compression Speed & Efficiency
Compression in ScyllaDB

Compression Chunk Size
Selecting Compression Chunk Sizes for ScyllaDB
+Determines the size of a compression block
+ScyllaDB block size defaults to: 4kB (SSTable), 1MB (RAID), filesystem sector size 1kB or 4kB
+Trade-off:
  +Larger chunk sizes – Reduce the bandwidth used to write data
  +Smaller chunk sizes – Reduce the bandwidth needed to read data
[Figure panels: Chunk size > Partition Size; Chunk size ~= Partition Size]

Use case           Recommendation   Comments
small single key   smaller chunks   close to partition size
large single key   larger chunks    close to partition size
partition scans    larger chunks    good cache locality
mostly writes      larger chunks    saves write bandwidth
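
One way to turn the recommendations above into a starting point is sketched below. The helper names, the power-of-two rounding and the 4 to 128 kB bounds are assumptions made for illustration, not an official sizing rule. The `sstable_compression` and `chunk_length_in_kb` option names are the ones commonly shown for ScyllaDB table compression; verify them against the documentation for your version before use.

```python
def pick_chunk_length_kb(partition_bytes, write_heavy=False, min_kb=4, max_kb=128):
    """Hypothetical heuristic: chunk size close to the partition size.

    Mostly-write tables get the largest chunk to save write bandwidth.
    The min_kb/max_kb bounds are assumptions for this sketch.
    """
    if write_heavy:
        return max_kb
    kb = min_kb
    # Grow in powers of two until one chunk covers a typical partition.
    while kb * 1024 < partition_bytes and kb < max_kb:
        kb *= 2
    return kb


def alter_table_cql(table, chunk_kb, compressor="LZ4Compressor"):
    """Build the CQL statement as a string; nothing is executed here."""
    return (
        f"ALTER TABLE {table} WITH compression = "
        f"{{'sstable_compression': '{compressor}', "
        f"'chunk_length_in_kb': {chunk_kb}}};"
    )


# Example: a table whose partitions are typically around 20 kB.
print(alter_table_cql("ks.events", pick_chunk_length_kb(20_000)))
```

Treat the result as a first guess and validate it against measured read and write bandwidth.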

Compaction Strategy
+The goal of a compaction strategy is low amplification
  +Read amplification – Avoid reading from too many SSTables
  +Write amplification – Avoid re-writing the same data over and over again
  +Space amplification – Avoid expired/deleted/overwritten data sitting on disk for too long
+If write performance is important for you…
  +❌ Avoid Leveled Compaction at all costs!
    +Every byte has to be rewritten up to 10 times per level
    +Usually 4 levels, so potentially up to 40x write amplification
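
The 40x figure above is a worst-case product of the two numbers on the slide. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the ingest volume in the example is made up for illustration.

```python
def leveled_write_amplification(levels=4, rewrites_per_level=10):
    """Worst-case write amplification: each byte rewritten at every level."""
    return levels * rewrites_per_level


def bytes_written_to_disk(ingested_bytes, amplification):
    """Physical bytes written for a given volume of ingested data."""
    return ingested_bytes * amplification


amp = leveled_write_amplification()  # 4 levels * 10 rewrites per level
ingested_gb = 100                    # illustrative ingest volume, in GB
print(f"{amp}x amplification: {ingested_gb} GB ingested -> "
      f"up to {bytes_written_to_disk(ingested_gb, amp)} GB written")
```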

ScyllaDB Under Load
Live Optimizing (or Worsening) Write Performance

Best Practices
Avoiding common mishaps

Batching Anti-Pattern
Far too much work for the coordinator
Christopher Batey's – Misuse of unlogged batches

Prefer Individual Inserts
Fully utilize cluster processing power

Batching – Good Pattern
All to the same partition
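
The three batching slides boil down to one rule: batch only writes that target the same partition, and send everything else as individual inserts. A minimal client-side sketch of that planning step follows. The function name `plan_writes` and the row layout are hypothetical, and no database driver is involved; the code only decides how rows would be grouped.

```python
from collections import defaultdict


def plan_writes(rows, partition_key):
    """Split rows into individual inserts and same-partition batches.

    Rows that share a partition key value can be batched together;
    rows that are alone in their partition are sent individually.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row)

    singles = [group[0] for group in groups.values() if len(group) == 1]
    batches = [group for group in groups.values() if len(group) > 1]
    return singles, batches


rows = [
    {"pk": 1, "v": "a"},
    {"pk": 2, "v": "b"},
    {"pk": 1, "v": "c"},
]
singles, batches = plan_writes(rows, "pk")
# singles: the lone row for pk=2; batches: one batch with both pk=1 rows
```

Individual inserts can then be issued concurrently so that every node shares the work, rather than one coordinator fanning out a multi-partition batch.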

Views & Global Indexes
+All writes to a base table are eventually propagated to the view table
+If the update changes a view’s key column, this means deleting the old view row and creating a new one, which requires a read-before-write:

UPDATE tbl SET x=3 WHERE pk=1
  -> read-before-write (fetch the old value of x)
  -> DELETE FROM view WHERE x=<old value>
  -> INSERT INTO view (x, …) VALUES (3, <old data>)
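
The extra work behind a view update can be sketched with two in-memory dictionaries. This is a toy model of the read, delete and insert sequence described above, not how ScyllaDB implements materialized views. The function name and the `(x, pk)` view key layout are assumptions for illustration.

```python
def update_with_view(base, view, pk, new_x):
    """Apply `UPDATE tbl SET x=new_x WHERE pk=pk` and maintain a view keyed on x.

    Returns the list of operations performed, to make the amplification visible.
    """
    ops = ["read base row"]            # read-before-write: find the old x
    old_row = base.get(pk)

    if old_row is not None and old_row["x"] != new_x:
        view.pop((old_row["x"], pk), None)   # DELETE FROM view WHERE x=<old value>
        ops.append("delete old view row")

    new_row = dict(old_row or {}, x=new_x)   # carry the old data forward
    base[pk] = new_row
    ops.append("write base row")
    view[(new_x, pk)] = new_row              # INSERT INTO view (x, ...) VALUES (...)
    ops.append("insert new view row")
    return ops


base = {1: {"x": 2, "data": "d"}}
view = {(2, 1): {"x": 2, "data": "d"}}
ops = update_with_view(base, view, pk=1, new_x=3)
# One logical update became a read plus three writes in this model.
```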

Local Indexes, CDC & Other Animals
+All writes are applied as a single mutation, however…
+It also results in write amplification as it requires yet another write

UPDATE tbl SET x=3 WHERE pk=1
  -> INSERT INTO index (pk, x, …) VALUES (1, 3)
  -> INSERT INTO tbl_cdc_log (…) VALUES (…)

Success Stories
How ScyllaDB is being used among your peers!

Zillow – Real-time Property Updates
Consuming records from different data producers: Out of Order Writes
+Why not SQL? Locking.
+Can't simply insert everything: Data bloat, higher costs
+Solution: INSERT … USING TIMESTAMP ?
“No one will even notice that we’re processing the entirety of Zillow’s property and listings data in order to correct some data issue or change a business rule. The beauty of that is we can process the entirety of the data at Zillow that we care about in less than a business day and, again, no performance hit to real-time data.” – Dan Podhola, Principal Software Engineer
Zillow: Optimistic Concurrency with Write-Time Timestamps
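
The idea behind writing with an explicit timestamp can be illustrated with a last-write-wins register: each record carries the time it was produced, and an older record arriving late does not overwrite a newer one. The sketch below is a simplified in-memory model, not Zillow's code or ScyllaDB's reconciliation logic. The function name is hypothetical, and ties on equal timestamps are simply ignored here, whereas a real database applies its own tie-breaking rule.

```python
def apply_write(store, key, value, timestamp):
    """Keep the value with the highest producer-supplied timestamp.

    Returns True if the write took effect, False if it was stale.
    """
    current = store.get(key)
    if current is None or timestamp > current[1]:
        store[key] = (value, timestamp)
        return True
    return False


store = {}
# Records arrive out of order from different producers.
apply_write(store, "listing-42", {"price": 510_000}, timestamp=200)
apply_write(store, "listing-42", {"price": 495_000}, timestamp=100)  # stale, ignored
# store["listing-42"] still holds the price written at timestamp 200
```

Because the outcome depends only on the timestamps, producers need no locks and no read-before-write, and arrival order does not matter.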

Unleashing the Power of Data with Scylla
"We were looking for something
really specific. A highly scalable, and highly performant NoSQL database. The answer was simple, ScyllaDB is a better fit for our use case."

João Pedro Voltani – Head of Engineering

Fanatics – Retail Operations
Use cases:
+Order capture
+Product Catalog
+Shopping Carts
+Promotions, …
“During a recent peak minute, we saw nearly 280,000 IOPs for a solid minute. With a 3-node ScyllaDB cluster, we registered zero timeouts. Because of this we had happier customers and application teams.” – Niraj Kothari, Director of Platforms Engineering
Read the Case Study

Poll
How much data do you have under management of your transactional database?

Guilherme Nogueira
Keep Learning
+Visit our blog for more on ScyllaDB engineering: scylladb.com/category/engineering
+Register now at p99conf.io
+DynamoDB Cost Optimization Masterclass: bit.ly/dynamodb-mc

Thank you
for joining us today.
@scylladb | scylladb/ | slack.scylladb.com | @scylladb | company/scylladb/ | scylladb/