Writing Low Latency Database Applications Even If Your Code Sucks

ScyllaDB · Jun 26, 2024 · 21 slides

About This Presentation

All latency lovers are used to the mystical experience of joy coming from aligning data to a cache line size and seeing your latency improve by hundreds of microseconds. But as transcendent as we can become, the laws of physics still apply to us.

That means that there is little we can do when data ...


Slide Content

Writing Low Latency Database Applications: Even if your code sucks! Glauber Costa, CEO at Turso.

Glauber Costa, CEO at Turso. Former VP of Field Engineering at ScyllaDB. Former contributor to the Linux Kernel. Latency aficionado. When I am not working, I take care of my 3 kids.

Your application and your database: how latency is commonly measured.
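In practice, "how latency is commonly measured" is a timer wrapped around the client's query call. A minimal TypeScript sketch; `db.execute` is a hypothetical stand-in for whatever call your client library actually exposes:

    // Time a single query from the application's point of view.
    // `db.execute` is a stand-in for your client library's query call.
    async function timedQuery(
      db: { execute: (sql: string) => Promise<unknown> },
      sql: string,
    ) {
      const start = performance.now();
      const rows = await db.execute(sql);
      const elapsedMs = performance.now() - start;
      console.log(`query took ${elapsedMs.toFixed(2)} ms`); // feed this into your P99 histogram
      return rows;
    }

Note that this clock starts at the application, so it captures only what happens between your process and the database, not the user's network hop.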

Zoom out: is that the end-to-end latency?

Zoom out: is that the end-to-end latency? What do you mean, “there are other countries”?!

Impact: P99 goes from 10 ms to 1 ms. That’s 10x better. Round trip NYC to Sydney: ~250 ms.

Impact: P99 goes from 10 ms to 1 ms, that’s 10x better; the round trip NYC to Sydney is ~250 ms. Barbie is not impressed: “I don’t care about your 10 ms savings, mate!”
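To make the joke concrete, here is the arithmetic the slide implies, using the slide's own numbers:

    // A 10x database-side win barely moves end-to-end latency when the
    // network round trip dominates.
    const rttMs = 250;           // NYC to Sydney round trip
    const beforeMs = rttMs + 10; // 260 ms end to end with a 10 ms P99
    const afterMs = rttMs + 1;   // 251 ms end to end with a 1 ms P99
    const gain = ((beforeMs - afterMs) / beforeMs) * 100;
    console.log(`${gain.toFixed(1)}% faster end to end`); // ~3.5%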

Solution 1: Write good code, but move your data closer to Sydney.

Solution 2: Put your data everywhere. Just do whatever you want and save yourself 200 ms.

Some estimates (approx.) of round-trip times from NYC: Sydney ~250 ms; São Paulo ~150 ms; San Francisco ~70 ms; Amsterdam ~70 ms; Tokyo ~180 ms. Source: https://wondernetwork.com/pings/New%20York

The Edge: global coverage, geographical abstraction. Serverless: Cloudflare Workers, Vercel Edge, Netlify Edge, etc. Or VMs: Fly.io, Koyeb, etc.
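As an illustration of what running on the edge looks like, here is a minimal Cloudflare Worker sketch; the fetch-handler shape is the platform's standard one, while the response body is purely illustrative:

    // A Cloudflare Worker runs in whichever edge location is closest to the
    // caller, so the compute half of the round trip stays short.
    export default {
      async fetch(request: Request): Promise<Response> {
        // Workers attach a `cf` object carrying the serving datacenter's code.
        const colo = (request as { cf?: { colo?: string } }).cf?.colo ?? "unknown";
        return new Response(`served from ${colo}`);
      },
    };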

The Edge: still bad! Your code now runs near the user, but the data is still an ocean away.

How? For per-user data: partitioning. For global data, unfortunately, there’s only one way: replication. But the problem with replication is cost.
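A sketch of what per-user partitioning can look like in TypeScript; every region code and URL here is hypothetical:

    // Each user's data lives in exactly one region, chosen when the account
    // is created; requests are routed to that region's database.
    const REGION_DB_URLS: Record<string, string> = {
      syd: "libsql://users-syd.example.com",
      gru: "libsql://users-gru.example.com",
      ewr: "libsql://users-ewr.example.com",
    };

    function dbUrlForUser(user: { homeRegion: string }): string {
      // One authoritative copy per user, close to its owner; no replication needed.
      return REGION_DB_URLS[user.homeRegion] ?? REGION_DB_URLS["ewr"];
    }

This works precisely because per-user data has a single natural home; data that every region reads has no such home, which is why it forces replication.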

Reframing the problem: how do you build the cheapest possible database, so that it can replicate to a lot of locations? What is a lot? 30, not 3. Seasonality of traffic. Cost of a GB: ~$0.10/month. Cost of a CPU (with reasonable memory): ~$20/month.
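Running the slide's unit prices shows why compute, not storage, is what makes wide replication expensive; the 10 GB database size is an assumed example, not from the slides:

    const locations = 30;            // "a lot", per the slide
    const storagePerGbMonth = 0.10;  // ~$0.10 per GB-month
    const cpuPerMonth = 20;          // ~$20 per CPU-month with reasonable memory
    const dbSizeGb = 10;             // assumed example size, not from the slides

    const storageCost = locations * dbSizeGb * storagePerGbMonth; // $30/month
    const computeCost = locations * cpuPerMonth;                  // $600/month
    console.log({ storageCost, computeCost });                    // compute dominates by 20x

A replica that keeps a CPU warm in every location is the real cost problem, which is why the scale-to-zero mentioned below matters.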

What is the cheapest SQL database? Hey, can we change SQLite to add the building blocks of replication? No. OK… (disappointed face)

What is the cheapest SQL database? libsql: a fork of SQLite. Everybody welcome. Major changes: write-ahead-log virtualization hooks; server mode.
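A sketch of the embedded and server modes using the @libsql/client TypeScript package; the hostname and environment variables are placeholders:

    import { createClient } from "@libsql/client";

    // Embedded: a plain local SQLite file, exactly as with stock SQLite.
    const local = createClient({ url: "file:local.db" });

    // Server mode: the same SQL, spoken over the network to a libsql server.
    const remote = createClient({
      url: "libsql://my-db.example.com",        // placeholder hostname
      authToken: process.env.LIBSQL_AUTH_TOKEN, // whatever auth your server expects
    });

    await local.execute("SELECT 1");
    await remote.execute("SELECT 1");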

What is the cheapest SQL database? Turso: built on libsql for massive replication. Serverless or embedded. Primary writes, replicated reads. Scale-to-zero. One-command replication, single-URL routing. Multitenancy.
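A sketch of the “primary writes, replicated reads” shape using @libsql/client's embedded-replica options; the URL, token variable, and `users` table are placeholders:

    import { createClient } from "@libsql/client";

    // An embedded replica serves reads from a local file and periodically
    // pulls changes from the primary; writes are delegated to the primary.
    const db = createClient({
      url: "file:replica.db",                   // local copy, answers reads
      syncUrl: "libsql://my-db-myorg.turso.io", // placeholder primary URL
      authToken: process.env.TURSO_AUTH_TOKEN,
      syncInterval: 60,                         // pull from the primary every 60 s
    });

    await db.sync();                                               // or sync on demand
    const rows = await db.execute("SELECT * FROM users LIMIT 10"); // answered locally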

What does it look like? Vercel’s Edge benchmark: 5 waterfall queries from edge compute locations, client in Brazil, database Postgres. Purple: latency from the client’s preferred location. Blue: latency from the DB’s preferred location.
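“Waterfall” means each query waits for the previous one, so the distance penalty is paid on every hop rather than once. A sketch; the query is a stand-in for the benchmark's real ones:

    // Five sequential queries: total time is roughly
    // 5 x (network round trip + query time).
    async function waterfall(db: { execute: (sql: string) => Promise<unknown> }) {
      for (let i = 0; i < 5; i++) {
        await db.execute("SELECT 1"); // each await blocks on a full round trip
      }
    }

With the database replicated near the edge location, each of those five round trips shrinks, which is the gap the purple and blue measurements contrast.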

What does it look like?

Summary: Moving data close to users can eclipse lots of optimization work. Partitioning and replication are really the only ways to do it. To make replication viable, it needs to be cheap, easy to set up, and easy to use.

Glauber Costa [email protected] @glcst https://turso.tech Thank you! Let’s connect.