Cassandra vs. ScyllaDB: Evolutionary Differences

Felipe Mendes, Technical Director at ScyllaDB
Guilherme Nogueira, Technical Director at ScyllaDB

Introductions
Felipe Mendes, Technical Director at ScyllaDB
+Published Author on Linux and Databases
+Helps teams solve their most challenging problems
+Years of experience with Linux and distributed systems

Guilherme Nogueira, Technical Director at ScyllaDB
+Previously Solutions Architect
+Publishing
+Streaming
+Automotive

Poll
How experienced are you with ScyllaDB/Cassandra?

Similarities and Differences

ScyllaDB & Cassandra – Similarities
+Distributed
+Peer-to-Peer
+Automatic Sharding
+Global Replication
+Cassandra Query Language
+Wide-Column
+Compatible Ecosystem
+Anti-Entropy
+LSM Engine

Beyond Cassandra
+Tablets
+Raft
+Workload Prioritization
+Repair-based Operations
+Incremental Compaction (SSTables, low amplification)
+Row-based Cache
+DynamoDB Compatibility
+Concurrency and Rate-limiters
Intriguing ScyllaDB Capabilities You Might Have Overlooked

Tablets
+Abstraction: Smaller table "fragments"
+Span a contiguous token range
+Dynamically shrink/expand (geometric avg size)
+Migrated as a single unit
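The bullets above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not ScyllaDB's implementation: the token ring, the `Tablet` fields, and the helper names `find_tablet` and `split_tablet` are all hypothetical, and each tablet has a single replica for brevity.

```python
from bisect import bisect_left
from dataclasses import dataclass

MIN_TOKEN = 0  # toy token ring for illustration, not ScyllaDB's real token space


@dataclass
class Tablet:
    last_token: int  # inclusive upper bound of this tablet's contiguous token range
    replica: str     # node holding the tablet (a single replica, for brevity)
    size_gb: float


def find_tablet(tablets, token):
    """Return the tablet whose token range contains `token` (list sorted by last_token)."""
    idx = bisect_left([t.last_token for t in tablets], token)
    return tablets[idx]


def split_tablet(tablets, idx):
    """Split tablet `idx` at the midpoint of its range; data is assumed evenly spread."""
    t = tablets[idx]
    first = tablets[idx - 1].last_token + 1 if idx > 0 else MIN_TOKEN
    mid = (first + t.last_token) // 2
    halves = [Tablet(mid, t.replica, t.size_gb / 2),
              Tablet(t.last_token, t.replica, t.size_gb / 2)]
    return tablets[:idx] + halves + tablets[idx + 1:]


tablets = [Tablet(16383, "A", 4.0), Tablet(32767, "B", 12.0),
           Tablet(49151, "C", 5.0), Tablet(65535, "A", 3.0)]

# The 12 GB tablet has grown too large: split it, then move one half to node C.
after_split = split_tablet(tablets, 1)
after_split[2].replica = "C"  # the tablet migrates as a single unit
```

After the split and the move, a lookup for token 30000 resolves to node C, while token 20000 still resolves to node B.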

Raft for metadata
■Strongly consistent
■Fault tolerant storage
[Diagram: system.token_metadata; node A, node B and node C; two bootstrap operations, each with a read barrier]

Workload Prioritization
[Figure: two panels, "No prioritization" and "Workload Prioritization"]

Repair? Tombstones? Data Resurrection?
+Worst things a database can do:
+Lose data
+Corrupt data
+Resurrect data
+Not a problem with ScyllaDB
+We take your data seriously
+We know repair is painful
Faster, Safer Node Operations with Repair vs Streaming

Incremental Compaction
[Figure: SSTable runs A, B, ..., Z and a, b, ..., z are compacted fragment by fragment (A+a, B+b) into Aa, Bb]
+We observed problems with legacy compaction strategies:
+STCS has high space amplification (and low write amplification)
+LCS has high write amplification (and low space amplification)
+We wanted to benefit from both approaches
+By borrowing SSTable Runs from LCS
+And applying them over size-tiers
+Merely replacing increasingly larger SSTables with increasingly longer SSTable Runs
Designing Access Methods: The RUM Conjecture
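A back-of-the-envelope model of the space trade-off described above, in Python. It is a sketch under assumptions, not a measurement: the function names, the 1 GB fragment size, and the "one in-flight fragment per input run plus one output fragment" rule are all hypothetical simplifications.

```python
def extra_space_whole_sstables(input_sizes_gb):
    """Size-tiered style: inputs are deleted only after the single output is complete."""
    return sum(input_sizes_gb)


def extra_space_sstable_runs(input_sizes_gb, fragment_gb):
    """Run-based: inputs are released fragment by fragment as output fragments are sealed.

    Modeled here as one in-flight fragment per input run plus the output fragment
    being written, capped by the whole-SSTable figure.
    """
    in_flight = (len(input_sizes_gb) + 1) * fragment_gb
    return min(in_flight, sum(input_sizes_gb))


tier = [100, 100, 100, 100]  # four similar-sized inputs, in GB (made-up numbers)
whole = extra_space_whole_sstables(tier)              # 400 GB of temporary headroom
runs = extra_space_sstable_runs(tier, fragment_gb=1)  # 5 GB of temporary headroom
```

Under this model the temporary disk headroom depends on the fragment size rather than on the size of the tier being compacted.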

Row-based Cache
+ScyllaDB has a fast cache
+Efficient access & maintenance
+Thanks to collocation with replica and design
+Takes care of consistency guarantees
+Handles complexities of data and query model
[Diagram: read path; memtable and cache in RAM, sstables on disk]
We Compared ScyllaDB and Memcached and… We Lost?

DynamoDB-compatible API (Alternator)
+Run DynamoDB-compatible workloads anywhere:
+on AWS
+on Google Cloud or Azure
+on-prem
+DynamoDB Streams, Global Tables
+Supports Load Balancing
+ScyllaDB Spark Migrator to move data anywhere
+Cassandra has no comparable feature

Per-Partition Rate-Limiting
Retaining Goodput with Query Rate Limiting
■Malicious/misbehaving users
■Parts of your system going awry due to bugs

The system does not have to satisfy these requests, and they should not affect the whole system too much.
■A maximum per-partition read/write rate can be set for a table.
■ScyllaDB will reject some operations in an effort to keep the rate of successful requests under the limit.

ALTER TABLE ks.tbl
WITH per_partition_rate_limit = {
'max_writes_per_second': 100,
'max_reads_per_second': 200
};
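To illustrate the idea of rejecting excess requests per partition, here is a minimal Python sketch. It uses a simple fixed one-second window, which is an assumption chosen for clarity and not a description of how ScyllaDB enforces the limit; the class and method names are hypothetical.

```python
from collections import defaultdict


class PerPartitionLimiter:
    """Illustrative fixed-window limiter keyed by partition (not ScyllaDB's algorithm)."""

    def __init__(self, max_ops_per_second):
        self.max_ops = max_ops_per_second
        self.window = None             # the one-second window currently being counted
        self.counts = defaultdict(int)

    def allow(self, partition_key, now):
        window = int(now)
        if window != self.window:      # a new second started: forget old counts
            self.window = window
            self.counts.clear()
        if self.counts[partition_key] >= self.max_ops:
            return False               # over the limit: reject instead of queueing
        self.counts[partition_key] += 1
        return True


# Mirrors 'max_writes_per_second': 100 from the statement above.
writes = PerPartitionLimiter(max_ops_per_second=100)
accepted_hot = sum(writes.allow("hot-key", now=0.5) for _ in range(250))
accepted_other = sum(writes.allow("other-key", now=0.5) for _ in range(10))
accepted_next = writes.allow("hot-key", now=1.2)
```

Only the overloaded partition sees rejections; requests to other partitions in the same table are unaffected, and the hot partition is accepted again once the next window starts.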

Poll
How large are your clusters?

Comparing Performance

Setup
+DB Nodes
+3x AWS i4i.4xlarge (Cassandra 5.0.2, ScyllaDB 2024.2)
+16 vCPUs, 128 GB RAM per node
+1.5 TB used (~45%)
+Schema: Blob(key<10>, c0<200>, c1<200>, c2<200>, c3<200>, c4<200>)
+RF=3
+LOCAL_QUORUM
+Loader
+AWS c6in.8xlarge – Rust Latte
+Implied scheduling – see (pkolaczk/latte#120)
cassandra_latest.yml

Where do we start?
+Multiple iterations/settings
+Pick the best and carry out remaining tests

Thread Pools
+Quite a pain to fine-tune
+Single funnel
+Writes and Reads won't scale independently

Cache Workload
+Key cache: 2G
+Row cache: 51G
+Bummer: To use or not? :-(

Hot/Cold Overwrites
+Hot set: 40M rows
+Cold set: Remainder
+64% hot reads, 16% hot writes – 16% cold reads, 4% cold writes
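The operation mix above can be reproduced with a small weighted sampler. This Python sketch is not the Latte workload used in the talk; the total row count is an assumption (the slide only says "Remainder"), and the function and constant names are hypothetical.

```python
import random

# Operation mix from the slide: (kind, share of all operations).
MIX = [
    ("hot_read", 0.64),
    ("hot_write", 0.16),
    ("cold_read", 0.16),
    ("cold_write", 0.04),
]
HOT_ROWS = 40_000_000        # hot set size from the slide
TOTAL_ROWS = 1_000_000_000   # assumption: the slide does not give the total


def next_operation(rng):
    """Pick an operation kind by weight, then a row id from the matching set."""
    kind = rng.choices([k for k, _ in MIX], weights=[w for _, w in MIX])[0]
    if kind.startswith("hot"):
        row = rng.randrange(HOT_ROWS)
    else:
        row = rng.randrange(HOT_ROWS, TOTAL_ROWS)
    return kind, row


rng = random.Random(42)
sample = [next_operation(rng) for _ in range(20_000)]
hot_share = sum(kind.startswith("hot") for kind, _ in sample) / len(sample)
read_share = sum(kind.endswith("read") for kind, _ in sample) / len(sample)
```

With this mix, about 80% of operations touch the hot set and about 80% of operations are reads.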

Scaling

Differences
+ScyllaDB – decouples topology changes from streaming
+Add nodes with time ~ 0
+Streaming happens in parallel
+Load gradually shifts, via tablet-aware drivers
+Cassandra – topology changes rely on streaming
+You add/remove a single node, and wait
+Then another, and wait…
+Time grows incrementally

What Happened?
+Cassandra 4.0 docs say:
+"To run any Zero Copy streaming benchmark the stream_throughput_outbound_megabits_per_sec must be set to a really high value"
+Cassandra 5 docs say:
+"To run any Zero Copy streaming benchmark the stream_throughput_outbound must be set to a really high value"
+Instaclustr says:
+entire_sstable_stream_throughput_outbound!
+Hint: Instaclustr is correct.

... and then you’ve got to Cleanup
+Not needed for ScyllaDB
+Boom!

[Chart: Bootstrap vs. Bootstrap + Cleanup]

Oh! By the way...
+Our Cassandra cluster got inconsistent :-(
+How to benchmark this?
+Fixed after a rolling restart
+Quite annoying

Demo time

Scaling to tackle 2M ops/s
●Starting from 3 x i4i.4xlarge
○2TB pre-replication dataset, RF=3
○~56K ops/s for Cassandra
○~200K ops/s for ScyllaDB
●Scaling to:
○ScyllaDB: + 3 x i4i.32xlarge
○Cassandra: + 69 x i4i.4xlarge

Scaling to tackle 2M ops/s
[Chart: Bootstrap vs. Bootstrap + Cleanup, annotated "26x faster"]

Cassandra scaling time
[Chart: Bootstrap vs. Bootstrap + Cleanup]
+< 300GB transferred, becomes linear

Cassandra node join process

ScyllaDB Scaling
●Process starts instantly and joins the cluster
●Load balancer continuously distributes tablets and load
●Client drivers are notified and route requests according to tablet movement

Costs

Different savings for different workloads
+Throughput
+Spiky and bounded – Batch, ETL
+ScyllaDB offers unparalleled throughput
+Latency sensitive
+Focus of our testing – Real-time and unpredictable
+ScyllaDB reacts faster to opportunities
+Storage dense – Tablets + Advanced Compression allow for up to 90% disk utilization
+Dictionary-based compression
+Data governance / Retention requirements
+ScyllaDB maximizes both disks and cache
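For the storage-dense case, the effect of disk utilization on node count is simple arithmetic. The Python sketch below uses hypothetical numbers: the dataset size, the per-node disk size, and the 50% baseline are assumptions for comparison only, and only the 90% figure comes from the slide.

```python
import math


def nodes_needed(dataset_tb, disk_tb_per_node, max_utilization):
    """Nodes required to hold the dataset at a given maximum disk utilization."""
    usable_tb = disk_tb_per_node * max_utilization
    return math.ceil(dataset_tb / usable_tb)


DATASET_TB = 100.0  # post-replication data, hypothetical
DISK_TB = 3.75      # per-node disk size, hypothetical

dense = nodes_needed(DATASET_TB, DISK_TB, 0.90)         # 90% utilization, as on the slide
conservative = nodes_needed(DATASET_TB, DISK_TB, 0.50)  # hypothetical 50% baseline
```

With these inputs the dense layout needs 30 nodes and the 50% baseline needs 54, so the storage-driven part of the bill scales with the inverse of the utilization you can safely run at.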

Run df on your Cassandra nodes for a SURPRISE

Observability

Cassandra - DIY, Community or 3rd Party
+No centralized offering for monitoring
+Community versions available (Metrics Collector for Apache Cassandra)
+Lacks 5.0 support
+Each team comes up with their own set of dashboards and alerts
+JMX complexity
+Newer versions expose some metrics in a Prometheus-friendly format, bypassing JMX
+JMX is still largely needed for Java metrics and maintenance operations
+Monitoring-as-a-Service varies in coverage and detail
+Datadog, AxonOps, Dynatrace, New Relic

ScyllaDB - Out of the box monitoring
+Easy, out of the box with ScyllaDB Monitoring Stack
+Prometheus + collection rules and alerts
+Loki + alerts
+Alertmanager + alerting rules
+Grafana + powerful dashboards
+New release = new features = new dashboards
+Stay up-to-date for the latest and greatest

Poll
How do you monitor your clusters?
Which observability tools do you use, whether custom-built or 3rd party?

Operational Simplicity

ScyllaDB setup and tuning
+sysctl ✅ automatic
+scylla_setup
+memory, scheduling, network
+disks parameters ✅ automatic
+iotune, part of scylla_setup
+best concurrency settings to maximize disk utilization
+hardware interrupts handling ✅ automatic
+dedicated vCPUs to handle hardware interrupts via irqbalance
+allows shards to run without interference
+JVM ✅ absent
+scylla.yaml ✅ simple changes
+usually just customized for enabling features (Alternator, encryption settings)

Repairs, backups
+ScyllaDB: ScyllaDB Manager
+Backup
+Restore
+Repair
+Cassandra:
+Backup/restore: Medusa / K8ssandra
+Repairs: Reaper

Vector functionality

Vector
●Both implement VECTOR<float, dim> type
●Approximate Nearest Neighbour (ANN) queries
●Similar CQL syntax: SELECT … ORDER BY col ANN OF vector LIMIT K
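An ANN query returns an approximation of what an exact nearest-neighbour scan would return. The Python sketch below shows that exact baseline: it scores every row and keeps the K most similar. It is not how either database executes the query; cosine similarity is an assumed similarity function, and the row ids and vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def nearest(rows, query, k):
    """Exact search: score every row, keep the k most similar."""
    ranked = sorted(rows, key=lambda row: cosine_similarity(row[1], query), reverse=True)
    return [row_id for row_id, _ in ranked[:k]]


rows = [
    ("a", [1.0, 0.0]),
    ("b", [0.9, 0.1]),
    ("c", [0.0, 1.0]),
    ("d", [-1.0, 0.0]),
]
top2 = nearest(rows, query=[1.0, 0.0], k=2)  # ["a", "b"]
```

The exact scan costs one similarity computation per row, which is why both implementations rely on an index structure to answer ANN queries without touching every vector.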

However, upon a closer look…

Vector - Cassandra implementation
●Implemented using Storage-Attached Index (SAI) and JVector
○Shared SSTable and compaction lifecycles
●Built on Indexes
○Susceptible to same issues
○Data locality, large partitions
●Shared data paths (storage, chunk cache)

Vector - ScyllaDB Cloud implementation
●External service
●In-memory data
●Rust-based service
●Leverages USearch library
●Fully-managed in the Cloud


And back to our demo

Wrap Up

Summary
+Benchmarks are complicated
+Be wary of sustained latencies on Apache Cassandra
+Measure sustained response times
+Our testing has limitations; it is impossible to test everything

+ScyllaDB outperforms Apache Cassandra 5.0 in every aspect
+Performance, Scaling, Costs
+Admin (Tip: check out what the process to upgrade to C*5 looks like ;-)

+Plus Workload Prioritization, Alternator, frictionless monitoring, no GC, …
+Both databases evolved on their own paths
+ScyllaDB focused on maintaining high performance, scalability and vector features, all while lowering costs
+Cassandra is built for commodity, aiming to be a general-purpose NoSQL database for use cases with broader latency tolerance

Guilherme Nogueira
Keep Learning
Fast Scaling. Max Efficiency. Lower Cost.
8 AM PT - 10 AM PT | 15:00 GMT - 17:00 GMT
ScyllaDB X Cloud
ScyllaDB University Live
LIVE LEARNING
November 12
ONLINE | OCT 22 + 23, 2025
All Things Performance
p99conf.io
scylladb.com/events

Thank you for joining us today.
@scylladb | slack.scylladb.com | company/scylladb/ | scylladb/
