Ingesting in Rust by Armin Ronacher from Sentry

ScyllaDB 154 views 30 slides Jun 24, 2024

About This Presentation

At Sentry we handle hundreds of thousands of events a second, from tiny metrics to huge memory dumps. What started as a small Django application had to be scaled up to support global points of presence, lower latency and significantly higher throughput. This talk walks through our experience buildi...


Slide Content

Ingesting in Rust Armin Ronacher Principal Architect at Sentry

Armin Ronacher (he/him) Principal Architect at Sentry I created Flask I love Queues, Pipelines and everything related Away from work I juggle three kids (not literally)

Setting the Stage

Errors

Replays

Tracing

Profiling

Rust at Sentry

Why is there Rust to Begin With? Initially personal interest, unrelated to Sentry Sentry was built on Python Good FFI between Rust and Python meant extension modules were an option Allowed efficient code sharing of performance critical code Source map library in Rust, exposed to Python

Rust in the Pipeline Pieces we will cover here Relay (Ingestion / Aggregation Service) Symbolicator (Stackwalk, Symbolize, Source Maps) Python Modules (Shared Code) * * a lot of it is forward looking

Did it work?

Rust Again? Short: would pick it again Benefits are obvious Downsides are obvious too It might work for you, it might not

The Choice is Obvious Today

Ingestion Flow SDK Relay Sentry Envelope Event Envelope Project Config Rate Limits

Building Relay

Relay Goals What it has to do: Efficiently Enforce Quotas Geo-distributed deployments Perform Sampling Data Format Conversion and Forwarding Perform Metrics Extraction PII Stripping

Picking Rust Sentry’s language stack was Python originally We acquired some Rust experience via our CLI Built a Python extension module reusing CLI code successfully Liked the idea of reusing code across our stack

How to Relay Sentry’s /store/ endpoint was written in Python “Anything goes” (Bad for Rust) No schema, inconsistent behavior Customers sent custom data Various quirks (submit via transparent pixel) Unreasonably lenient

Relay as Library Fix up existing protocol where possible Write new protocol normalizer in Rust, expose to Python Side-by-side run old and new normalization, compare results Learnings: serde has a lot of flaws
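The side-by-side run described on this slide can be sketched roughly as follows. This is a minimal illustration, not Sentry's actual harness: `Comparison` and `compare_normalizers` are hypothetical names, and the two normalizers are passed in as plain closures.

```rust
/// Outcome of running the old and the new normalizer on one input.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
pub enum Comparison<T> {
    /// Both implementations agreed; the shared result is kept.
    Match(T),
    /// The implementations disagreed; both results are kept for inspection.
    Mismatch { old: T, new: T },
}

/// Runs both normalizers on the same input and reports whether they agree.
pub fn compare_normalizers<I, T, F, G>(input: &I, old: F, new: G) -> Comparison<T>
where
    I: ?Sized,
    T: PartialEq,
    F: Fn(&I) -> T,
    G: Fn(&I) -> T,
{
    let old_out = old(input);
    let new_out = new(input);
    if old_out == new_out {
        Comparison::Match(old_out)
    } else {
        Comparison::Mismatch { old: old_out, new: new_out }
    }
}
```

In a setup like this, mismatches would typically be logged while the old result keeps being served, so the new code can be compared on real traffic before it takes over.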

Serde Limits Rather low recursion limits Old Python JSON code accepted Infinity/NaN Python <-> Python event submission contained lots of NaNs Workaround: pass over byte stream to replace “NaN” with “0 ” etc. Limited Data Model Serde relies on problematic in-band signalling for data model foreign types Recursion without Limits Complex recovery from errors, not enough state
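The byte-stream workaround for NaN and Infinity might look something like the sketch below. It is not Relay's code: `scrub_non_finite` is a hypothetical name, and padding the replacement with spaces so the buffer keeps its length is an assumption suggested by the padded “0 ” on the slide.

```rust
/// Overwrites bare `NaN` / `Infinity` tokens that appear outside of JSON
/// strings with `0` followed by spaces, so the buffer length is unchanged.
/// (Hypothetical helper; the same-length padding is an assumption.)
pub fn scrub_non_finite(buf: &mut [u8]) {
    const TOKENS: [&[u8]; 2] = [b"NaN", b"Infinity"];
    let mut in_string = false;
    let mut i = 0;
    while i < buf.len() {
        let byte = buf[i];
        if in_string {
            match byte {
                b'\\' => i += 1, // also skip the escaped byte
                b'"' => in_string = false,
                _ => {}
            }
            i += 1;
            continue;
        }
        if byte == b'"' {
            in_string = true;
            i += 1;
            continue;
        }
        let mut replaced = false;
        for token in TOKENS.iter() {
            if buf[i..].starts_with(token) {
                buf[i] = b'0';
                for b in &mut buf[i + 1..i + token.len()] {
                    *b = b' ';
                }
                i += token.len();
                replaced = true;
                break;
            }
        }
        if !replaced {
            i += 1;
        }
    }
}
```

Tracking whether the scan is inside a string keeps a literal "NaN" in user data untouched, and `-Infinity` becomes `-0` plus padding, which is still valid JSON.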

In Band Signalling Large integer “42”: {"$serde_json::private::Number": "42"} https://github.com/serde-rs/serde/issues/1183 https://github.com/serde-rs/serde/issues/1463
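Why in-band signalling is a problem can be shown with a toy model. The sketch below is not serde's implementation: the `Wire` and `Value` types and the `$toy::private::Number` key are invented for illustration. The point is that once a foreign type is smuggled through as a map with a magic key, ordinary user data with the same shape becomes indistinguishable from it.

```rust
use std::collections::BTreeMap;

/// Magic key used to smuggle big numbers through the data model.
/// (Hypothetical; modelled on the token shown on the slide.)
pub const NUMBER_TOKEN: &str = "$toy::private::Number";

/// What the generic data model can carry: no "big number" kind exists.
#[derive(Debug, Clone, PartialEq)]
pub enum Wire {
    String(String),
    Map(BTreeMap<String, Wire>),
}

/// What the application actually wants to get back.
#[derive(Debug, PartialEq)]
pub enum Value {
    String(String),
    BigNumber(String),
    Map(BTreeMap<String, Value>),
}

/// Encodes a big number in-band as a single-entry map with the magic key.
pub fn encode_big_number(digits: &str) -> Wire {
    let mut map = BTreeMap::new();
    map.insert(NUMBER_TOKEN.to_string(), Wire::String(digits.to_string()));
    Wire::Map(map)
}

/// Decodes wire data, treating the magic single-entry map as a big number.
pub fn decode(wire: Wire) -> Value {
    match wire {
        Wire::String(s) => Value::String(s),
        Wire::Map(map) => {
            if map.len() == 1 {
                if let Some(Wire::String(digits)) = map.get(NUMBER_TOKEN) {
                    return Value::BigNumber(digits.clone());
                }
            }
            Value::Map(map.into_iter().map(|(k, v)| (k, decode(v))).collect())
        }
    }
}
```

A customer payload that happens to contain a map with that key would be decoded as a number here, which is the kind of ambiguity an out-of-band representation avoids.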

Footgun: memmove (!?) Rust makes some expensive patterns look cheap Performance Issue #1: Huge Return Types (Large Error Types) Performance Issue #2: Accidental, Unexpected Clones

Large Error Types Boxing errors internally creates a significant performance improvement
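The effect of boxing can be seen from the size of the `Result` alone. The sketch below uses a hypothetical `BigError` with a 256-byte inline payload. A `Result` is at least as large as its biggest variant, so every return of the unboxed version moves that many bytes even on the success path, while the boxed version stays pointer-sized plus a tag.

```rust
use std::mem::size_of;

/// Hypothetical error type that carries a lot of inline context.
pub struct BigError {
    pub code: u32,
    pub context: [u8; 256],
}

/// Returns the large error by value: the `Result` is as big as `BigError`.
pub fn parse_digit_unboxed(byte: u8) -> Result<u8, BigError> {
    if byte.is_ascii_digit() {
        Ok(byte - b'0')
    } else {
        Err(BigError { code: 1, context: [0; 256] })
    }
}

/// Returns the error behind a `Box`: the `Result` stays small, and the
/// allocation only happens on the (rare) error path.
pub fn parse_digit_boxed(byte: u8) -> Result<u8, Box<BigError>> {
    if byte.is_ascii_digit() {
        Ok(byte - b'0')
    } else {
        Err(Box::new(BigError { code: 1, context: [0; 256] }))
    }
}

/// Sizes in bytes of the unboxed and the boxed return type.
pub fn result_sizes() -> (usize, usize) {
    (
        size_of::<Result<u8, BigError>>(),
        size_of::<Result<u8, Box<BigError>>>(),
    )
}
```

Clippy's `result_large_err` lint flags the unboxed pattern; whether boxing pays off in a given code base still needs measuring.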

10000% Speedup Apply offsets to N tokens, accidentally perform memmove over M bytes N times
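The slide does not show the offending code, so the following is only an assumed example of the same pattern: inserting a marker byte at N offsets in a buffer of M bytes. `Vec::insert` looks like a cheap call but shifts the whole tail each time, giving O(N·M) work, while a single pass that builds a new buffer copies each byte once. Both function names are hypothetical.

```rust
/// Inserts `byte` at every offset by calling `Vec::insert` repeatedly.
/// Offsets must be sorted ascending and refer to the original buffer.
/// Each call shifts everything after the offset (a memmove of up to M bytes).
pub fn insert_naive(buf: &mut Vec<u8>, offsets: &[usize], byte: u8) {
    // Walk backwards so earlier offsets stay valid after each insert.
    for &off in offsets.iter().rev() {
        buf.insert(off, byte);
    }
}

/// Produces the same result in one pass: every input byte is copied once.
/// Offsets must be sorted ascending and refer to the original buffer.
pub fn insert_single_pass(buf: &[u8], offsets: &[usize], byte: u8) -> Vec<u8> {
    let mut out = Vec::with_capacity(buf.len() + offsets.len());
    let mut start = 0;
    for &off in offsets {
        out.extend_from_slice(&buf[start..off]);
        out.push(byte);
        start = off;
    }
    out.extend_from_slice(&buf[start..]);
    out
}
```

Both functions return identical bytes; only the amount of copying differs, which is why this kind of issue tends to show up in a profiler as time spent in memmove rather than in any obviously slow function.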

Picking Frameworks

Relay First Steps old tokio + actix + actix-web (pre async/await) actix looked appealing: actor framework, resonates actix has no reasonable concept of backpressure management over time became a mess

Relay Today Axum Custom service layer async/await (mostly) More explicit back-pressure management
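What "explicit back-pressure" can mean in practice is sketched below with a bounded queue that refuses work when full instead of buffering without limit. This is an assumption-laden illustration, not Relay's service layer: it uses the standard library's synchronous channel to stay self-contained, whereas Relay runs on async/await, and `Intake` and `Admission` are hypothetical names.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Decision returned to the caller for each submitted envelope.
#[derive(Debug, PartialEq)]
pub enum Admission {
    Accepted,
    /// The queue is full or the worker is gone; the caller should shed load,
    /// for example by answering with an error status.
    Overloaded,
}

/// Front door of a hypothetical service with a bounded work queue.
pub struct Intake {
    tx: SyncSender<Vec<u8>>,
}

impl Intake {
    /// Creates an intake holding at most `capacity` pending envelopes and
    /// returns it together with the receiving end for the worker.
    pub fn new(capacity: usize) -> (Intake, Receiver<Vec<u8>>) {
        let (tx, rx) = sync_channel(capacity);
        (Intake { tx }, rx)
    }

    /// Tries to enqueue without blocking, making overload visible to the caller.
    pub fn submit(&self, envelope: Vec<u8>) -> Admission {
        match self.tx.try_send(envelope) {
            Ok(()) => Admission::Accepted,
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => {
                Admission::Overloaded
            }
        }
    }
}
```

The design choice is that overload surfaces as a value at the edge of the system, where it can be turned into a rejection, rather than as unbounded memory growth somewhere in the middle of the pipeline.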

PyO3

Python Modules Export via PyO3 Rust code to Python Use maturin to create and distribute wheels Very good solution for GIL and borrow management

Thank you! Let’s connect. Armin Ronacher Principal Architect at Sentry