Patterns of Low Latency by Pekka Enberg

ScyllaDB · 43 slides · Oct 17, 2024

About This Presentation

Building for low latency is important, but the tips and tricks are often part of developer folklore and hard to discover on your own. This talk shares some of the important latency-related patterns you want to know when working on low-latency apps.


Slide Content

ScyllaDB Community
Patterns of Low Latency
Pekka Enberg
Founder/CTO at Turso

Pekka Enberg

Founder/CTO at Turso
■Previously ScyllaDB and Linux
■P99 happens often; the conference should be P99.999

Outline
■Why is latency important?
■Measuring latency
■Reducing latency
■Hiding latency
■Tuning the system
■Summary

Why is latency important?

Low latency = good customer experience
■Bad latency often means unhappy customers.

■It’s very easy to write code with bad tail latency.

■Bad latency in just one component can affect many users.

Users experience tail latency often
■Tail latency is the high percentiles (e.g. 99th) of the latency distribution.

■Users experience tail latency fairly often because latency compounds.

■Example: If you fan out processing to 10 components and wait for all of them
to complete, about 10% of user requests experience the 99th percentile latency.
●P(at least one slow) = 1 − P(all n fast) = 1 − (1 − P(slow request))ⁿ
Dean and Barroso. (2013) The Tail at Scale. Communications of the ACM
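A quick back-of-the-envelope check of that fan-out math, in plain Python (nothing assumed beyond the formula on the slide):

```python
# Probability that at least one of n parallel calls hits the slow tail,
# given each call independently has probability p_slow of being slow.
def p_any_slow(n: int, p_slow: float) -> float:
    return 1.0 - (1.0 - p_slow) ** n

# Fan out to 10 components, each with a 1% chance of a P99-tail response:
print(round(p_any_slow(10, 0.01), 3))   # 0.096, i.e. roughly 10% of requests
```

With 100 components the same formula gives about 63%, which is why fan-out makes the tail everyone's problem.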


Measuring latency

Primer on measuring latency
■Latency is a distribution so measure it as such.

■Average latency hides the tail; in a closed-loop system it is essentially just the inverse of throughput (Little's law), so it says little about user experience.

■Maximum latency is an interesting metric, but hard to optimize.

■The 99th percentile latency and beyond is a good compromise.

Beware coordinated omission
■In coordinated omission, the benchmark accidentally coordinates with the system
being measured: it delays sending the next request while a slow response is outstanding.

■As a result, outliers in the latency distribution are under-counted or not measured at all.
Ivan Prisyazhynyy, On Coordinated Omission (2021)
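A minimal, simulated sketch of the difference (the `service` function and all timings are made up): a closed-loop benchmark measures latency from when it actually sent each request, while the corrected version measures from when the request *should* have gone out on the intended schedule.

```python
# Simulated closed-loop benchmark vs. a coordinated-omission-corrected one.
# The "service" normally takes 1 ms, but stalls for 1000 ms on one request.
# All times are milliseconds on a simulated clock; `service` is illustrative.

def service(i: int) -> float:
    return 1000.0 if i == 5 else 1.0

INTERVAL = 10.0  # intended load: one request every 10 ms
N = 10

naive, corrected = [], []
clock = 0.0
for i in range(N):
    intended_start = i * INTERVAL              # when the request *should* go out
    actual_start = max(clock, intended_start)  # closed loop waits for the reply
    finish = actual_start + service(i)
    naive.append(finish - actual_start)        # hides time spent queued
    corrected.append(finish - intended_start)  # charges the stall to later requests
    clock = finish

print(sum(t > 100 for t in naive), sum(t > 100 for t in corrected))  # 1 vs 5
```

The naive numbers report one bad sample; the corrected ones show that half the scheduled requests were actually affected by the stall.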

Visualizing latency
■Visualizing latency is important

■Histograms are good…

■…but eCDFs can be even better!
Marc Brooker, Histogram vs eCDF (2022)

Examples

Example of a latency histogram (percentile plot):
the x-axis represents percentiles and the y-axis represents latency.

Example of a latency eCDF:
the x-axis represents latency and the y-axis represents the cumulative probability.
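A tiny sketch of how eCDF points are built from raw latency samples (plain Python; the sample values are made up):

```python
# Build empirical CDF points from latency samples: for each sorted sample x_i,
# the eCDF value is (i + 1) / n, the fraction of samples <= x_i.
def ecdf(samples):
    xs = sorted(samples)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

latencies_ms = [1.2, 1.1, 1.3, 1.2, 9.8, 1.1, 1.4, 1.2, 1.3, 50.0]
points = ecdf(latencies_ms)
print(points[-2], points[-1])   # the top of the tail: (9.8, 0.9) (50.0, 1.0)
```

Reading the tail straight off the curve: 90% of requests finished within 9.8 ms, and the slowest 10% stretch all the way to 50 ms.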


Reducing latency

Reducing latency
■Latency lurks everywhere.

■To reduce latency:
●Avoid data movement
●Avoid work
●Avoid waiting

Avoiding data movement

Avoiding data movement
■Moving data is slow:
●Network round-trip between New York and London is at least 57 ms.
●Data center network round trip is about 100-500 μs.
●DRAM access latency is 100 ns.

■Move data where it is used:
●Colocation
●Replication
●Caching

Colocation
■Colocation is a technique to reduce latency by moving two components close
to each other.

■For example, move the database to the same machine as your application
logic to eliminate network round-trip latency.

■Colocation is not always possible, unless you have multiple copies…

Replication and caching
■Replication and caching are techniques to reduce latency by keeping a copy of
the data close to where it is used.

■Both techniques have pros and cons around consistency.

■Often a good way to reduce latency if you can deal with the storage
amplification.
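A minimal read-through cache sketch in Python. The `fetch_remote` function is a made-up stand-in for any slow remote lookup; a real cache also needs invalidation, TTLs, and size bounds, which is where the consistency trade-offs above come in.

```python
# Read-through cache: serve from the local copy when possible, fall back to
# the slow "remote" store on a miss, and remember the answer for next time.
remote_calls = 0

def fetch_remote(key):           # illustrative stand-in for a network hop
    global remote_calls
    remote_calls += 1
    return f"value-of-{key}"

cache = {}

def get(key):
    if key not in cache:         # miss: pay the round trip once...
        cache[key] = fetch_remote(key)
    return cache[key]            # ...hits are a local memory access

get("user:42"); get("user:42"); get("user:42")
print(remote_calls)  # 1: only the first read left the machine
```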

Avoiding work

Avoiding work
■You can reduce latency by doing less:
●Tame algorithmic complexity
●Control memory management
●Optimize your code
●Avoid CPU-intensive computation

Taming algorithmic complexity
■Understand if the algorithm you use is suitable for low latency:
●An O(1) algorithm is (probably) fine.
●An O(n²) algorithm or worse is (probably) not fine.

■Data structures you likely see in low latency code:
●Queues and stacks
●Arrays
●Hash tables

■Data structures you probably won’t see in low latency code:
●Linked lists
●Graphs
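As a concrete illustration of the queues point: in Python, `collections.deque` pops from the front in O(1), while `list.pop(0)` shifts every remaining element (O(n)), a classic hidden cost in a hot path. A small timing sketch (N is arbitrary):

```python
# Draining a queue of N items: O(1) per pop with deque vs. O(n) per pop
# with list.pop(0), which shifts all remaining elements every time.
from collections import deque
import timeit

N = 50_000

def drain_list():
    q = list(range(N))
    while q:
        q.pop(0)        # O(n): shifts the whole remaining list

def drain_deque():
    q = deque(range(N))
    while q:
        q.popleft()     # O(1): constant-time removal from the front

t_list = timeit.timeit(drain_list, number=1)
t_deque = timeit.timeit(drain_deque, number=1)
print(t_deque < t_list)  # True: the O(1) structure wins as N grows
```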

Controlling memory management
■Avoid dynamic memory management
●Allocating memory on the fast path is a likely source of latency outliers.
●You can do low latency with a pauseless GC, but you still need to avoid allocating objects.

■Avoid demand paging
●Virtual memory is an illusion of a memory space as large as the disk space.
●This means virtual memory (in the worst case) runs at disk speed.
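One common way to keep allocation off the fast path is a preallocated object pool. A minimal sketch (the pool and buffer sizes are arbitrary, and a real pool would also handle exhaustion instead of letting `popleft` raise):

```python
# Preallocated buffer pool: all allocation happens once at startup; the
# fast path only moves buffers between the free list and the caller.
from collections import deque

POOL_SIZE = 8
BUF_SIZE = 4096

free_bufs = deque(bytearray(BUF_SIZE) for _ in range(POOL_SIZE))

def acquire() -> bytearray:
    return free_bufs.popleft()   # O(1), no allocation (raises if exhausted)

def release(buf: bytearray) -> None:
    free_bufs.append(buf)        # return for reuse instead of freeing

buf = acquire()
buf[:5] = b"hello"               # use the buffer on the fast path
release(buf)
print(len(free_bufs))  # 8: the buffer went back to the pool
```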

Optimizing your code
■Optimizing code can reduce latency
●Reducing CPU cycles, cache misses, and so on is needed for low latency

■Find bottlenecks with a profiler, optimize, and repeat the process.

■Beware of optimizing one thing at the expense of another.
●For example, batching can reduce CPU cycles but increase latency.

Avoid CPU-intensive computation
■CPU-intensive computation can hurt latency.

■Avoid long-running tasks by splitting the work.
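A sketch of the work-splitting idea: process a large input in fixed-size chunks and yield control between chunks, so latency-sensitive work can run in between instead of waiting for the whole task (the chunk size is arbitrary):

```python
# Split a long-running computation into chunks so other work can be
# interleaved between them instead of waiting behind the whole task.
def chunked_sum(data, chunk_size=1000):
    total = 0
    for start in range(0, len(data), chunk_size):
        total += sum(data[start:start + chunk_size])
        yield None          # yield control: an event loop could run here
    yield total             # final result after the last chunk

steps = list(chunked_sum(list(range(10_000))))
print(steps[-1])  # 49995000: same answer, delivered in resumable slices
```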

Avoiding waiting

Avoiding waiting
■Eliminate synchronization

■Use wait-free synchronization

■Don’t wait for the OS

■Don’t wait for the network

Eliminate synchronization
■Synchronization such as mutual exclusion means threads wait

■Partition data to eliminate synchronization
●Thread-per-core is effective because CPUs run independently of each other.

■Make shared data structures read-only when possible
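The partitioning idea can be sketched as hash-routing each key to a fixed shard so that only one worker ever touches it; no locks are needed because no data is shared. The shard count and key format here are illustrative:

```python
# Partition keys across workers so each worker owns its shard exclusively.
import zlib

NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]   # one private dict per worker

def shard_of(key: str) -> int:
    # Stable hash (unlike Python's seeded hash()) so the same key
    # always routes to the same worker.
    return zlib.crc32(key.encode()) % NUM_SHARDS

def put(key: str, value) -> None:
    shards[shard_of(key)][key] = value         # only the owning shard is touched

put("user:1", "alice")
put("user:2", "bob")
print(shard_of("user:1") == shard_of("user:1"))  # True: routing is stable
```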

Use wait-free synchronization
■Use wait-free synchronization if you can’t partition data.

■Wait-free = every operation completes in a finite number of steps, no waiting

■For example, single-producer, single-consumer queues are a great low-level
primitive for low latency request processing.
Herlihy. (1991) Wait-free synchronization. TOPLAS
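A structural sketch of a bounded single-producer/single-consumer ring buffer. This is illustrative only: a production SPSC queue (in C, C++, or Rust) relies on atomic head/tail indices with acquire/release memory ordering, which Python does not expose; what the sketch shows is the single-writer-per-index discipline that makes the design work.

```python
# Bounded SPSC ring buffer: the producer only ever writes `head`, the
# consumer only ever writes `tail`, so neither side blocks the other.
class SpscQueue:
    def __init__(self, capacity: int):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.head = 0   # written only by the producer
        self.tail = 0   # written only by the consumer

    def push(self, item) -> bool:
        if self.head - self.tail == self.capacity:
            return False                     # full: caller retries, never blocks
        self.buf[self.head % self.capacity] = item
        self.head += 1                       # publish after the write
        return True

    def pop(self):
        if self.tail == self.head:
            return None                      # empty: caller retries, never blocks
        item = self.buf[self.tail % self.capacity]
        self.tail += 1
        return item

q = SpscQueue(2)
print(q.push(1), q.push(2), q.push(3))  # True True False (queue is full)
print(q.pop(), q.pop(), q.pop())        # 1 2 None (queue is empty)
```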

Don’t wait for the OS
■Avoid context switching
●Don’t create too many threads, and go easy on system calls.

■Use non-blocking I/O
●Blocking a kernel thread on I/O is a guaranteed source of bad tail latency.

■Use busy-polling (if energy is not a concern)
●It can be faster to poll for an event than to wait for it.

■Bypass the kernel (if you can)
●For example, XDP and DPDK provide ways to bypass the OS network stack for lower latency.
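A minimal non-blocking I/O sketch using Python’s standard `selectors` module over a local socket pair; a real server would multiplex many connections through the same loop instead of parking one thread per socket:

```python
# Non-blocking I/O: instead of blocking a thread on read(), register the
# socket with the OS event mechanism (epoll/kqueue) and read only when ready.
import selectors
import socket

rd, wr = socket.socketpair()
rd.setblocking(False)            # reads must never park the thread

sel = selectors.DefaultSelector()
sel.register(rd, selectors.EVENT_READ)

wr.send(b"ping")                 # make the read side readable

events = sel.select(timeout=1.0) # returns as soon as data is waiting
for key, _mask in events:
    data = key.fileobj.recv(4096)
    print(data)                  # b'ping'

sel.close(); rd.close(); wr.close()
```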

Don’t wait for the network
■Disable Nagle’s algorithm (use TCP_NODELAY)

■Avoid head-of-line blocking
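Setting TCP_NODELAY is one line on a standard socket. A sketch with Python’s `socket` module (no connect target is shown; this only demonstrates the option itself):

```python
# Disable Nagle's algorithm so small writes go out immediately instead of
# being buffered while waiting for an ACK of previously sent data.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Verify the option took effect on this socket:
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)  # True

sock.close()
```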


Hiding latency

Hiding latency
■Parallelize request processing
●Perform request processing in parallel instead of serially to reduce latency.

■Hedge requests
●Send the request to multiple servers and use the result from the fastest one.

■Light-weight threads
●For example, GPUs hide latency by executing massive numbers of light-weight threads in
parallel, which hides memory access latency.
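A sketch of request hedging with Python threads: fire the primary, and if it has not answered within a hedge delay, fire a backup and take whichever reply arrives first. The `call_replica` function and its sleep durations simulate a slow primary and are purely illustrative:

```python
# Hedged request: start a backup copy of the request if the primary is slow,
# and return whichever reply arrives first.
import concurrent.futures as cf
import time

def call_replica(name: str, delay_s: float) -> str:
    time.sleep(delay_s)              # stand-in for a network round trip
    return name

HEDGE_AFTER_S = 0.05                 # hedge delay, e.g. the observed P95

with cf.ThreadPoolExecutor(max_workers=2) as pool:
    primary = pool.submit(call_replica, "primary", 0.5)   # slow today
    done, _ = cf.wait([primary], timeout=HEDGE_AFTER_S)
    if not done:                     # primary missed the hedge deadline
        backup = pool.submit(call_replica, "backup", 0.01)
        done, _ = cf.wait([primary, backup], return_when=cf.FIRST_COMPLETED)
    winner = done.pop().result()
    print(winner)                    # "backup": the hedge saved the request
```

Note the cost: hedging trades extra load (duplicate requests) for a shorter tail, so the hedge delay is usually set near a high percentile rather than firing both copies immediately.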


Tuning the system

Tuning the system
■Configure CPU frequency scaling

■Isolate CPUs for application threads

■Disable swap

■Configure network stack interrupt affinity


Summary

tl;dr
■Latency is a distribution; measure and visualize it as such, and watch out for
coordinated omission.

■Avoid data movement, work, and waiting to reduce latency.

■Hide latency if you can’t reduce it.

■Tune your system for low latency.

Thank you! Let’s connect.
Pekka Enberg
penberg@iki.fi
@penberg
http://penberg.org
50% discount:
P992024