Timeseries Storage at Ludicrous Speed by Duarte Nunes

ScyllaDB 2 views 21 slides Oct 09, 2025

About This Presentation

Datadog’s real-time storage system for timeseries data ingests billions of points per second and serves thousands of queries per second with sub-second latencies. This session will describe how the system is designed, how it has evolved over time, and the techniques we use today to meet our perfor...


Slide Content

A ScyllaDB Community
Timeseries storage at ludicrous speed
Duarte Nunes
Staff Engineer

Duarte Nunes (he/him)

Staff Engineer at Datadog
■Background in distributed systems
■Rewrote way too many bespoke databases in Rust
■Into landscape/wildlife photography and letterpressed books

Billions of datapoints / second
Millions of hosts

High-level overview of the Metrics Platform

An RTDB cluster

Inside an RTDB node

Monocle

Data Model: (org, metric, hash(tags)) -> [(timestamp, value)]
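
The mapping above can be sketched as a keyed map from series to points. This is a minimal illustration, not Monocle's actual code: the names `SeriesKey`, `Point`, `SeriesStore`, and `hash_tags` are hypothetical, and the field types and the choice to sort tags before hashing are assumptions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Identifies one series: (org, metric, hash(tags)). Field types are assumed.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct SeriesKey {
    pub org: u64,
    pub metric: String,
    pub tags_hash: u64,
}

/// One datapoint: (timestamp, value).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub timestamp: i64,
    pub value: f64,
}

/// Hashes a tag set so that the order in which tags arrive does not matter.
/// Sorting and deduplicating first is an assumption of this sketch.
pub fn hash_tags(tags: &[&str]) -> u64 {
    let mut sorted: Vec<&str> = tags.to_vec();
    sorted.sort_unstable();
    sorted.dedup();
    let mut hasher = DefaultHasher::new();
    sorted.hash(&mut hasher);
    hasher.finish()
}

/// In-memory map from series key to its points, kept ordered by timestamp.
#[derive(Default)]
pub struct SeriesStore {
    series: HashMap<SeriesKey, Vec<Point>>,
}

impl SeriesStore {
    fn key(org: u64, metric: &str, tags: &[&str]) -> SeriesKey {
        SeriesKey { org, metric: metric.to_string(), tags_hash: hash_tags(tags) }
    }

    /// Inserts a point, keeping the series sorted by timestamp.
    pub fn insert(&mut self, org: u64, metric: &str, tags: &[&str], point: Point) {
        let points = self.series.entry(Self::key(org, metric, tags)).or_default();
        let idx = points.partition_point(|p| p.timestamp <= point.timestamp);
        points.insert(idx, point);
    }

    /// Returns the points of a series, or an empty slice if it is unknown.
    pub fn points(&self, org: u64, metric: &str, tags: &[&str]) -> &[Point] {
        self.series
            .get(&Self::key(org, metric, tags))
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

Because only the hash of the tags is part of the key, two writes with the same tags in a different order land in the same series.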

Thread-per-coreish

A Monocle worker instance

Memtable

A Monocle file set

Files & Compaction

Cache

Throttling queries under load
Gated queries and cost-based scheduling
■Gates allow admission based on:
●Ingestion lag
●Available memory
●Overall concurrency
■Cost-based scheduling
●Uses CoDel to manage latency
●Queries subject to timeout
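
A rough sketch of how an admission gate over the three signals above might look. All names (`NodeState`, `GateLimits`, `admit`), the thresholds, and the idea of an estimated per-query memory cost are assumptions for illustration. It covers only the gate checks; the CoDel-based scheduling is not modelled.

```rust
use std::time::Duration;

/// Snapshot of the node signals the gates look at (hypothetical shape).
#[derive(Debug, Clone, Copy)]
pub struct NodeState {
    pub ingestion_lag: Duration,
    pub available_memory_bytes: u64,
    pub running_queries: usize,
}

/// Thresholds for each gate; the values are up to the operator.
#[derive(Debug, Clone, Copy)]
pub struct GateLimits {
    pub max_ingestion_lag: Duration,
    pub min_available_memory_bytes: u64,
    pub max_concurrency: usize,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RejectReason {
    IngestionLag,
    LowMemory,
    TooManyQueries,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Admission {
    Admit,
    Reject(RejectReason),
}

/// Admits a query only if every gate is open.
pub fn admit(state: &NodeState, limits: &GateLimits, estimated_query_bytes: u64) -> Admission {
    // Gate 1: do not add query load while ingestion is falling behind.
    if state.ingestion_lag > limits.max_ingestion_lag {
        return Admission::Reject(RejectReason::IngestionLag);
    }
    // Gate 2: keep a memory reserve after accounting for this query's estimate.
    let memory_after = state.available_memory_bytes.saturating_sub(estimated_query_bytes);
    if memory_after < limits.min_available_memory_bytes {
        return Admission::Reject(RejectReason::LowMemory);
    }
    // Gate 3: cap the number of queries running at once.
    if state.running_queries >= limits.max_concurrency {
        return Admission::Reject(RejectReason::TooManyQueries);
    }
    Admission::Admit
}
```

Checking ingestion lag first reflects one possible priority (protect writes before reads); the slides do not say in which order the gates are evaluated.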

Partitioning
By partitioning on a hash, queries are executed by all workers
■Parallelism is great?
●Amdahl's law
●Head of line blocking
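
To make the two caveats concrete, here is a small sketch. `amdahl_speedup` is the standard Amdahl's law formula; `fanout_latency_ms` is a hypothetical helper that models a fanned-out query as finishing only when its slowest worker does, which is where head-of-line blocking on a single busy worker hurts.

```rust
/// Amdahl's law: speedup from `workers` when only `parallel_fraction`
/// of the work can run in parallel.
pub fn amdahl_speedup(parallel_fraction: f64, workers: u32) -> f64 {
    assert!((0.0..=1.0).contains(&parallel_fraction));
    assert!(workers >= 1);
    1.0 / ((1.0 - parallel_fraction) + parallel_fraction / f64::from(workers))
}

/// A query executed by all workers completes when the slowest one answers,
/// so one worker stuck behind an expensive request delays the whole query.
pub fn fanout_latency_ms(per_worker_ms: &[u64]) -> u64 {
    per_worker_ms.iter().copied().max().unwrap_or(0)
}
```

For example, with 90% of the work parallelizable, 10 workers give a speedup of about 5.26 rather than 10, and no number of workers can exceed 10.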

Rust
We’ve had lots of success building on Rust and Tokio
■Intermediate scheduler woes (e.g., FuturesUnordered)
●Buffering
●Yielding
■Future cancellation is hard to reason about
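
One way to see why cancellation is hard to reason about: a Rust future is cancelled simply by dropping it, so any code after a suspended `.await` may never run. The sketch below uses only the standard library and polls by hand instead of using Tokio; `YieldOnce`, `NoopWaker`, `reserve_then_commit`, and `poll_once_then_drop` are hypothetical names made up for this illustration.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Returns Pending on the first poll and Ready on the second.
/// Stands in for any `.await` point that actually suspends.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// A two-step update with a suspension point in the middle.
async fn reserve_then_commit(reserved: Rc<Cell<u32>>, committed: Rc<Cell<u32>>) {
    reserved.set(reserved.get() + 1);
    YieldOnce(false).await; // if the future is dropped here, the commit never runs
    committed.set(committed.get() + 1);
}

/// Polls the future once and then drops it, which is what a timeout or a
/// losing `select!` branch does. Returns (reserved, committed).
pub fn poll_once_then_drop() -> (u32, u32) {
    let reserved = Rc::new(Cell::new(0));
    let committed = Rc::new(Cell::new(0));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let mut fut = Box::pin(reserve_then_commit(reserved.clone(), committed.clone()));
    let first = fut.as_mut().poll(&mut cx);
    assert!(first.is_pending());
    drop(fut); // cancellation: no error, no panic, the rest just never executes

    (reserved.get(), committed.get())
}
```

`poll_once_then_drop()` returns `(1, 0)`: the reservation happened, the commit did not, and nothing signalled the gap. Restoring invariants in that situation usually means a `Drop` guard or restructuring the code so that no state is half-updated across an `.await`.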

Looking ahead

Smarter Routing
Move to a more dynamic load-balancing system
■Current routing is inspired by Google’s Slicer
●Balances PPS
●Slow to adjust
■Data movement to adapt to query pattern changes and bursts

Collocating Points and Tags
Single realtime database, stop indexing everything
■Columnar data
●Read only what’s being queried
●Series-per-row
●Great for SIMD
●Compression techniques that allow random access to the data (e.g., FSST)
■No more thread-per-core?

Thank you! Let’s connect.
Duarte Nunes