Fast and Deterministic Full Table Scans at Scale by Felipe Cardeneti Mendes

ScyllaDB 0 views 15 slides Oct 09, 2025

Slide 1 of 15

About This Presentation

ScyllaDB’s new tablet replication algorithm replaces static vNodes with dynamic, elastic data distribution that adapts to shifting workloads. This talk discusses how tablets enable fast, predictable full table scans by keeping operations shard-local, balancing load automatically, and scaling linea...

Size: 1.99 MB

Language: en

Added: Oct 09, 2025

Slides: 15 pages

Slide Content

A ScyllaDB Community
Fast and Deterministic Full
Table Scans at Scale
Felipe Cardeneti Mendes
Technical Director

Felipe Cardeneti Mendes (he/him)

Technical Director at ScyllaDB
■Vibe-coded the simulator for this preso
■Predictability is important
■Previous speaker on Caching :-)
■Father of a 3yo toddler (puppy)

Full Scans

Wrong Way: Single Query
■Single coordinator
■Limited parallelism
■Doesn't exploit concurrency
SELECT * FROM tbl WHERE <...>
Coordinator work

vNodes Way: Non-deterministic
Which value to pick for :X and :Y?
■Between -2^63 to +2^63 - 1
●Traverse it → done.
■Large token boundary
●Fewer requests, more data
●Higher coordination
■Smaller token boundary
●More requests, fewer data
●Less coordination

SELECT * FROM tbl WHERE token(key) >= :X AND
token(key) < :Y

vNodes Way: system.size_estimates
■Per-table estimate of replica owned vNodes
■Non-deterministic
●Prone to estimate skews
●Range size is partitionsCount * meanPartitionSize
■Used by tools like Spark and Trino
●Manual splitSize tuning
■Divides rangeSize into subranges
Token Ranges

Estimates aren't enough
■Sparse tables waste work
●Most ranges contain little or no data
> SELECT (...) FROM size_estimates WHERE keyspace_name='k' AND table_name='t'

table_name | range_start | range_end | mean_partition_size | partitions_count
-----------+----------------------+----------------------+---------------------+------------------
t | -1078580004477237357 | -966880618140703446 | 2048 | 1250
t | -1085604003861837200 | -1078580004477237357 | 32768 | 156
t | -1117853426191590572 | -1085604003861837200 | 1024 | 3200
t | -1144611754287717220 | -1117853426191590572 | 1048576 | 1000000
t | -1198247555952586603 | -1144611754287717220 | 128 | 10
(...)
■High-density tables require ﬁne-tuning
●Prone to uneven load and contention

Tablets

Recap
A
C
B
C
A
B
■Single unit of replication in ScyllaDB
■Abstraction: Smaller table "fragments"
■Span a contiguous token range
■Dynamically shrink/expand (geometric avg size)

system.tablets – A layer of indirection
■Tablets table
○Each (table, tablet) has its own token range → (node, shard) mapping
○Mapping can change independently of node addition and removal
■A new node can be added without any owned data!
○Different tables can have different tablet counts
○Tablet counts change, tablet size on disk remains relatively constant

system.tablets
Query
Replica
Set
Token
> SELECT … FROM system.tablets WHERE …;

last_token | tablet_count | replicas
----------------------+--------------+-----------------
-9079256848778919937 | 128 | [(replica1, 0), (replica2, 0)]
-8935141660703064065 | 128 | [(replica2, 1), (replica3, 1)]
-8791026472627208193 | 128 | [(replica3, 0), (replica2, 0)]
(...)

Revisited (and Simpliﬁed) Logic
■Clients parse system.tablets (retrieve existing tablet mapping)
■Tablets spanning the same replica-shards get grouped and split together
■Workers set a routingKey for requests
SELECT * FROM tbl WHERE token(key) >= :X AND
token(key) < :Y

Implementation

Multi-Stage Queues
■Coordinator
●Retrieves actual tablet → (node, shard) mapping
●Handles splitting/balancing to a corresponding shard queue
■Scheduler
●Least Loaded Replica Balancing
■Workers
●Dispatches requests
●Handle concurrency (inﬂight: 2)
●Applies backpressure

Summary
■Vibe-coded simulator: github.com/fee-mendes/tablet-fullscans
■Assumes homogeneous topologies
●Mixed-shard clusters left as an exercise :)
■Mimics jitter (sleeps)
■Relatively constant time

Thank you! Let’s connect.
Felipe Mendes
[email protected]
@felipemendes.dev
scylladb.com

Fast and Deterministic Full Table Scans at Scale by Felipe Cardeneti Mendes

About This Presentation

Slide Content

Tags

Categories

Download

Quick Actions

Statistics

Related Slideshows

Fast and Deterministic Full Table Scans at Scale by Felipe Cardeneti Mendes

About This Presentation

Slide Content

Slide 1

Slide 2

Slide 3

Slide 4

Slide 5

Slide 6

Slide 7

Slide 8

Slide 9

Slide 10

Slide 11

Slide 12

Slide 13

Slide 14

Slide 15

Tags

Categories

Download

Quick Actions

Statistics

Related Slideshows

8-top-ai-courses-for-customer-support-representatives-in-2025.pptx

7-essential-ai-courses-for-call-center-supervisors-in-2025.pptx

25-essential-ai-courses-for-user-support-specialists-in-2025.pptx

8-essential-ai-courses-for-insurance-customer-service-representatives-in-2025.pptx

Know for Certain

PPT OPD LES 3ertt4t4tqqqe23e3e3rq2qq232.pptx