As Fast as Possible, But Not Faster: ScyllaDB Flow Control by Nadav Har'El


About This Presentation

Pushing requests faster than a system can handle results in rapidly growing queues. Left unchecked, this risks exhausting memory and destabilizing the system. This talk discusses how we engineered ScyllaDB’s flow control for high-volume ingestion, allowing it to throttle over-eager clients to exactly the right pace.


Slide Content

A ScyllaDB Community
As Fast As Possible, But Not Faster:
ScyllaDB Flow Control
Nadav Har’El
Distinguished Engineer

Nadav Har’El

Distinguished Engineer at ScyllaDB
■Before Scylla, I wrote a Hebrew spell-checker, a
kernel from scratch, and other eclectic stuff.
■My favorite percentile is 99.
■I’m a mathematician in theory, programmer in
practice.
■I have three children (and also a ?????? and a ??????).
October 2025

INTRODUCTION

AND GOALS OF THIS TALK

Ingestion
■We look into ingestion of data into a Scylla cluster,
i.e., a large volume of write requests.
■We would like the ingestion to proceed as fast as
possible, but not faster.

“as fast as possible, but not faster”?

■An over-eager client may send writes faster than the
cluster can complete earlier requests.
■We can absorb a short burst of requests in queues.

Allow the client to continue writing at an excessive rate → the backlog of uncompleted writes grows until memory runs out!

Goals of this talk


Understand:
■The different causes of queue buildup during writes.
■The flow control mechanisms Scylla uses to automatically cause ingestion to proceed at the optimal pace.
A simulator for experimenting with different workloads and flow control mechanisms:
https://github.com/nyh/flowsim

Dealing with an over-eager writer


■As the backlog grows, the server needs a way to tell the client
to slow down its request rate.
■The CQL protocol does not offer explicit flow-control
mechanisms for the server to slow down a client.
■Two options remain: delaying replies to the client’s
requests, and failing requests.
■Which one we can use depends on what drives the workload:

Workload model 1: Fixed concurrency



■Batch workload: Application wishes to write a large batch of
data, as fast as possible (driving the server at 100% utilization).
■A fixed number of client threads, each running a request loop:
prepare some data, make a write request, wait for the response (sketched below).
■The server can control the request rate by throttling (delaying)
its replies:
●If the server only sends N replies per second, the client will
only send N new requests per second!
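
As a concrete illustration of this model, here is a minimal, self-contained sketch (plain Python threads, with a fake synchronous write call standing in for a real CQL driver; the thread count and service time are made-up numbers): each thread has at most one request in flight, so the server controls the request rate simply by deciding when to reply.

```python
import threading, time

N_THREADS = 4         # fixed client concurrency
SERVICE_TIME = 0.01   # assumed: the "server" takes 10 ms to reply to each write

def fake_write(payload):
    # Stand-in for a synchronous CQL write: returns only when the server replies.
    time.sleep(SERVICE_TIME)

def writer_loop(n_requests):
    # The request loop from the slide: prepare data, write, wait for the reply.
    for i in range(n_requests):
        payload = {"id": i, "value": "x" * 100}   # prepare some data
        fake_write(payload)                       # blocks until the reply arrives

threads = [threading.Thread(target=writer_loop, args=(250,)) for _ in range(N_THREADS)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
# With one request in flight per thread, throughput is capped by the reply rate:
# roughly N_THREADS / SERVICE_TIME writes per second, however eager the client is.
print(f"{N_THREADS * 250} writes in {elapsed:.1f}s ≈ {N_THREADS * 250 / elapsed:.0f} writes/s")
```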

Workload model 2: Unbounded concurrency





■Interactive workload: The client sends requests driven by some
external events (e.g., activity of real users).
●Request rate unrelated to completion of previous requests.
●If request rate is above cluster capacity, the server can’t slow down
these requests and the backlog grows and grows.
●To avoid that, we must fail some requests.
■To reach optimal throughput, we should fail fresh client
requests: admission control (sketched below).
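
A minimal sketch of the admission-control idea under these assumptions (the class and threshold names are made up for illustration, not Scylla's actual code): once the backlog of accepted-but-uncompleted requests crosses a limit, fresh requests are rejected immediately instead of being queued.

```python
class Overloaded(Exception):
    """Error returned to the client instead of queueing its request."""

class AdmissionController:
    def __init__(self, max_backlog=1000):
        self.max_backlog = max_backlog   # illustrative limit (could be count- or memory-based)
        self.backlog = 0                 # requests accepted but not yet completed

    def admit(self):
        # Reject *fresh* requests while overloaded; work already admitted still
        # completes, so the throughput of successful writes stays near the maximum.
        if self.backlog >= self.max_backlog:
            raise Overloaded("server overloaded, please retry later")
        self.backlog += 1

    def on_complete(self):
        self.backlog -= 1
```

In this sketch the server would call admit() when a request arrives and on_complete() when the write (including its background work) finishes.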

FLOW CONTROL OF
REGULAR WRITES

Regular writes (i.e., no materialized view)





A coordinator node receives a write and:
■Sends it to RF (e.g., 3) replicas.
■Waits for CL (e.g., 2) of those writes to complete,
■Then replies to the client: “desired consistency-level achieved”.
■The remaining (e.g., 1) writes to replicas will continue
“in the background” without the client waiting.
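
A minimal asyncio sketch of this coordinator behavior (the structure is illustrative, not Scylla's actual Seastar/C++ code, and the replica writes are simulated with sleeps): reply after CL acknowledgements and let the remaining replica writes finish in the background.

```python
import asyncio, random

async def replica_write(name):
    # Simulated write to one replica, with a random latency.
    await asyncio.sleep(random.uniform(0.001, 0.010))
    return name

async def coordinator_write(rf=3, cl=2):
    # Send the write to all RF replicas at once.
    writes = [asyncio.ensure_future(replica_write(f"replica-{i}")) for i in range(rf)]
    acks = 0
    for finished in asyncio.as_completed(writes):
        await finished
        acks += 1
        if acks == cl:
            break   # desired consistency level achieved: reply to the client now
    # The remaining RF - CL replica writes keep running "in the background";
    # neither the coordinator's reply nor the client waits for them.
    return "consistency level achieved"

async def main():
    print(await coordinator_write())
    await asyncio.sleep(0.02)   # demo only: let the background writes finish

asyncio.run(main())
```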

Why background writes cause problems







■A batch workload, upon receiving the server’s reply, will send
a new request before these background writes finish.
■If new writes come in faster than we complete background
writes, the number of these background writes can grow
without bound.

Typical cause: a slower node

■3 nodes, one slightly slower: two nodes handle 10,000 writes per second (wps) each, the third only 9,900 wps (1% slower).
■RF=3, CL=2.
■Each second:
●10,000 writes are replied to as soon as the two faster nodes respond.
●The backlog of background writes to the slowest node increases by 100.
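
A back-of-the-envelope calculation with the numbers from this example, showing why even a 1% mismatch is unsustainable without flow control:

```python
reply_rate = 10_000   # writes/s replied to once the two faster nodes respond
slow_rate = 9_900     # writes/s the slowest node can actually absorb (1% slower)

# Background writes to the slow node pile up at the difference of the two rates.
growth_per_second = reply_rate - slow_rate        # 100 background writes/s
for minutes in (1, 10, 60):
    queued = growth_per_second * 60 * minutes
    print(f"after {minutes:3d} min: {queued:,} queued background writes")
```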

Regular writes - Scylla’s solution

A simple, but effective, throttling mechanism:
■When total memory used by background writes exceeds some
limit (10% of shard’s memory), the coordinator will only reply after
all RF replica writes have completed (not just after CL).
■The backlog of background writes does not continue to grow.
■Replies are only sent at the rate we can complete all the work, so a
batch workload will slow down its requests to the same rate.
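
A sketch of this throttling decision, extending the earlier coordinator sketch (the 10% figure is from the slide; the memory accounting, names, and shard size are simplified assumptions, not Scylla's actual implementation):

```python
import asyncio

SHARD_MEMORY = 1 << 30                    # assumed: 1 GiB of memory per shard
BACKGROUND_LIMIT = SHARD_MEMORY // 10     # throttle once background writes exceed 10% of it
background_bytes = 0                      # memory currently held by background writes

async def coordinator_write(replica_writes, cl, write_size):
    global background_bytes
    tasks = [asyncio.ensure_future(w) for w in replica_writes]
    if background_bytes >= BACKGROUND_LIMIT:
        # Throttled path: the backlog is too large, so reply only after *all* RF
        # writes complete. Replies now go out at the rate we finish all the work,
        # and a fixed-concurrency batch client slows down to that same rate.
        await asyncio.gather(*tasks)
        return "ok (throttled)"
    # Normal path: reply after CL acks and account for the leftover background work.
    acks = 0
    for finished in asyncio.as_completed(tasks):
        await finished
        acks += 1
        if acks == cl:
            break
    background_bytes += write_size
    asyncio.ensure_future(_finish_background(tasks, write_size))
    return "ok"

async def _finish_background(tasks, write_size):
    global background_bytes
    await asyncio.gather(*tasks)          # wait for the remaining RF - CL replica writes
    background_bytes -= write_size        # release the memory accounted to this write
```

It can be exercised with the replica_write simulator from the earlier sketch by passing three replica_write(...) coroutines as replica_writes with cl=2.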

Simulation of “slower node” example

(Simulation plot; max backlog = 300)

FLOW CONTROL OF
MATERIALIZED VIEW WRITES

Write to a table with materialized views

■As before: coordinator sends writes to RF (e.g., 3) replicas,
waits for only the first CL (e.g., 2) of those writes to complete.
■Each replica later sends updates to the view table(s);
the client does not wait for these view updates.
●A deliberate, though often-debated, design choice.
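
A minimal asyncio sketch of this replica-side ordering (the sleeps stand in for real work; this is an illustration of the behavior described above, not Scylla's code):

```python
import asyncio

async def apply_to_base_table(write):
    await asyncio.sleep(0.001)       # stand-in for the actual base-table write

async def send_view_update(view, write):
    await asyncio.sleep(0.005)       # stand-in for the (typically slower) view update

async def replica_apply(write, views):
    # The base-table write is what the coordinator (and thus the client,
    # up to CL) actually waits for.
    await apply_to_base_table(write)
    # View updates are generated afterwards and sent in the background:
    # neither the coordinator nor the client waits for them.
    for view in views:
        asyncio.ensure_future(send_view_update(view, write))
    return "base write applied"

async def main():
    print(await replica_apply({"k": 1}, views=["view_a", "view_b"]))
    await asyncio.sleep(0.01)        # demo only: let the background view updates finish

asyncio.run(main())
```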

Why view writes cause problems

■Again the problem is background work the client doesn’t await.
■There may be a lot of such work: with V views, we typically have
2*V of these background view writes.
■If new writes come in faster than we can finish view updates,
the number of these queued view updates can grow without
bound.

Simulation (problem)

(Simulation plot showing pending view writes, pending base writes, and the request rate)

View updates - solution

■How to prevent view backlog from growing endlessly?
■Add an extra delay to each client write.
●Higher delay: slows the client, the view-update backlog declines.
●Lower delay: speeds up the client, the view-update backlog increases.
■Goal: a controller, changing delay as backlog changes, to
keep backlog in desired range.

View updates - solution

■A simple linear controller:
●delay = α * backlog
■When client concurrency is fixed, this will converge on a delay
that keeps the backlog constant:
●If delay is higher, client slows and backlog declines,
causing delay to go down.
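
A toy, self-contained simulation of this linear controller under assumed numbers (the concurrency, latencies, capacity, and α below are made up, and each base write is assumed to produce one view update; see the flowsim repository above for the real simulator): the backlog climbs until the added delay slows the client to exactly the view-update capacity, then stays put.

```python
ALPHA = 1e-4            # assumed gain: seconds of added delay per queued view update
BASE_LATENCY = 0.001    # assumed: seconds until CL replica writes complete
CONCURRENCY = 100       # fixed client concurrency (batch workload)
VIEW_CAPACITY = 50_000  # assumed: view updates the cluster can apply per second

backlog = 0.0
dt = 0.0001             # simulate in 0.1 ms steps
for _ in range(5000):
    delay = ALPHA * backlog                               # delay = alpha * backlog
    request_rate = CONCURRENCY / (BASE_LATENCY + delay)   # one request per reply, per thread
    # One view update per base write (simplification); backlog grows by the excess rate.
    backlog = max(0.0, backlog + (request_rate - VIEW_CAPACITY) * dt)

# At the fixed point the request rate equals VIEW_CAPACITY, i.e. the added delay
# converges to CONCURRENCY/VIEW_CAPACITY - BASE_LATENCY = 1 ms here, and the
# backlog settles at delay/ALPHA = 10 queued view updates.
print(f"steady-state backlog ≈ {backlog:.1f}, added delay ≈ {ALPHA * backlog * 1000:.2f} ms")
```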

Simulation (solution)

(Simulation plot showing pending base writes, pending view writes, and the request rate)

Conclusions

■Scylla flow-controls a batch workload:
●Data is ingested as fast as possible, but not faster than that.
■When the client cannot be slowed down to the optimal rate:
●Scylla starts dropping new requests, to achieve the highest
possible throughput of successful writes.
■Automatic: no need for user intervention or configuration.

Thank you! Let’s connect.
Nadav Har’El
[email protected]