How We Boosted ScyllaDB Data Streaming by 25x by Asias He

ScyllaDB 75 views 9 slides Mar 05, 2025

About This Presentation

Streaming, the process that moves data to other nodes when scaling out or in, used to analyze every partition one by one; it was too slow and depended on the schema. File based streaming is a new feature that significantly optimizes tablet movement. It streams entire SSTable files without deserializing SSTable files in...


Slide Content

A ScyllaDB Community
How We Boosted ScyllaDB
Data Streaming by 25x
Asias He
Principal Software Engineer

Asias He
■ Asias He is a long-time open source developer who
previously worked on the Debian Project, the Solaris kernel,
KVM virtualization for Linux, and the OSv unikernel. He
now works on Seastar and ScyllaDB.

Agenda
■ What is streaming
■ Mutation based streaming
■ File based streaming
■ Performance improvement

What is streaming in Scylla
Streaming is a low-level mechanism to move data
between nodes for multiple operations:
● Add node
● Remove node
● Migrate tablets
● Rebuild tablets
● …

What is mutation based streaming
[Diagram]
Sender Node: SSTables -> Reader -> Mutations -> Serialize
Network
Receiver Node: Deserialize -> Mutations -> Writer -> SSTables

What is file based streaming (New!)
[Diagram]
Sender Node: SSTables -> Reader -> File Stream
Network
Receiver Node: File Stream -> Writer -> SSTables
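
The contrast between the two data paths above can be sketched with a toy model. This is only an illustration under loud assumptions: the "SSTable" here is newline-delimited JSON rather than the real SSTable format, and `make_toy_sstable`, `mutation_based_stream`, and `file_based_stream` are hypothetical names, not ScyllaDB APIs. The point it tries to show is that the mutation path decodes and re-encodes every row on both sides, while the file path only copies opaque bytes.

```python
import io
import json


def make_toy_sstable(rows):
    """Encode rows as newline-delimited JSON (a stand-in, NOT the real SSTable format)."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in rows).encode()


def mutation_based_stream(sstable_bytes):
    """Toy mutation path: decode each row, serialize, 'send', deserialize, rewrite."""
    decodes = 0
    received_file = io.BytesIO()
    for line in sstable_bytes.splitlines():
        mutation = json.loads(line)                            # reader: file -> mutation
        decodes += 1
        wire = json.dumps(mutation, sort_keys=True).encode()   # serialize for the network
        mutation_rx = json.loads(wire)                         # deserialize on the receiver
        decodes += 1
        # writer: mutation -> new file on the receiver
        received_file.write(json.dumps(mutation_rx, sort_keys=True).encode() + b"\n")
    return received_file.getvalue(), decodes


def file_based_stream(sstable_bytes, chunk_size=64):
    """Toy file path: copy the file in fixed-size chunks without looking inside it."""
    decodes = 0  # never incremented: rows are not parsed on this path
    source = io.BytesIO(sstable_bytes)
    received_file = io.BytesIO()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        received_file.write(chunk)
    return received_file.getvalue(), decodes


rows = [{"pk": i, "value": f"v{i}"} for i in range(1000)]
sstable = make_toy_sstable(rows)
via_mutations, mutation_decodes = mutation_based_stream(sstable)
via_file, file_decodes = file_based_stream(sstable)
print(mutation_decodes, file_decodes)
```

Both paths deliver the same bytes to the receiver in this toy, but only the mutation path does per-row work, and only the mutation path needs to understand the row layout (the schema dependence mentioned in the abstract).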

Benefits of file based streaming
■ Less CPU consumption
  ● No more processing of individual mutations
  ● No more serialization work of mutations
■ Less network consumption
  ● SSTable format is more compact than mutations

Performance
Tablet migration test with mutation and file based streaming

                          Time to finish  Stream bandwidth  Bytes on wire per tablet  CPU load
Mutation based streaming  3003 seconds    100 MB/s          20090 MB                  12%
File based streaming      116 seconds     1000 MB/s         7280 MB                   4%
Difference                25X             10X               2.75X                     3X

● 3 Scylla nodes i4i.2xlarge
● 3 Loaders t3.2xlarge
● 1 Billion partitions
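
As a quick check of the "Difference" row, the ratios can be recomputed from the reported figures. This is a small sketch; the dictionary keys and the `improvement` helper are names chosen here for illustration, and the numbers are copied from the slide.

```python
# Figures as reported on the slide.
mutation = {"time_s": 3003, "bandwidth_mb_s": 100, "wire_mb": 20090, "cpu_pct": 12}
file_based = {"time_s": 116, "bandwidth_mb_s": 1000, "wire_mb": 7280, "cpu_pct": 4}


def improvement(before, after, higher_is_better=False):
    """Return how many times better 'after' is than 'before'."""
    return after / before if higher_is_better else before / after


ratios = {
    "time": improvement(mutation["time_s"], file_based["time_s"]),
    "bandwidth": improvement(mutation["bandwidth_mb_s"],
                             file_based["bandwidth_mb_s"],
                             higher_is_better=True),
    "wire": improvement(mutation["wire_mb"], file_based["wire_mb"]),
    "cpu": improvement(mutation["cpu_pct"], file_based["cpu_pct"]),
}

for name, value in ratios.items():
    print(f"{name}: {value:.2f}x")
```

The computed values come out at roughly 25.9x for time and 2.76x for bytes on wire, which the slide states as 25X and 2.75X; bandwidth and CPU load come out at exactly 10x and 3x.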

Stay in Touch
Asias He
[email protected]
@asias
Thanks!