Building a Fast Lock-free Queue for Trading Systems by Sarthak Sehgal

ScyllaDB, Oct 15, 2025

About This Presentation

When every microsecond counts, inter-thread communication must be lean and predictable. This talk dives into the design of a high-performance SPSC bounded queue for ultra-low latency trading systems. We’ll cover eliminating locks with atomics, reducing sync costs with memory ordering, and avoiding...

Slide Content

A ScyllaDB Community
Building a Fast Lock-free Queue for Trading Systems
Sarthak Sehgal
Tech Lead

Sarthak Sehgal he/him

Tech Lead at Maven Securities
■Working at a high frequency options market making firm
■Interested in finance, low level programming, and C++ under the hood
■Sometimes, I write about C++ at sartech.substack.com

Overview
■Motivation & Problem Setup
■Building an SPSC queue using std::atomic
■Optimizations
●Memory Ordering
●Cache alignment
●Variable caching

Motivation
■On Nasdaq:
●1 million msgs/s
●Peaks at 3-4 million msgs/s at open/close

Source: Stefan Schlamp LinkedIn

Problem Setup
■One producer and one consumer
■Both threads are pinned to separate physical cores
■Fixed-size buffer (bounded queue)

Queue using std::atomic

Producer

Consumer
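
The producer and consumer code from these slides is not part of this transcript. As a stand-in, here is a minimal sketch of a bounded SPSC ring buffer that uses only the default std::atomic loads and stores. The names SpscQueue, try_push, try_pop and transfer_sum are hypothetical, and keeping one slot unused to tell "full" from "empty" is an assumption of this sketch, not necessarily the design shown in the talk.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical bounded SPSC queue; names are illustrative, not from the talk.
template <typename T>
class SpscQueue {
public:
    // One slot is left unused so that head == tail always means "empty".
    explicit SpscQueue(std::size_t capacity) : buf_(capacity + 1) {
        assert(capacity > 0);
    }

    // Producer thread only. Returns false when the queue is full.
    bool try_push(const T& value) {
        const std::size_t tail = tail_.load();           // default ordering
        const std::size_t next = (tail + 1) % buf_.size();
        if (next == head_.load()) return false;          // full
        buf_[tail] = value;                              // write the slot first
        tail_.store(next);                               // then publish it
        return true;
    }

    // Consumer thread only. Returns false when the queue is empty.
    bool try_pop(T& out) {
        const std::size_t head = head_.load();
        if (head == tail_.load()) return false;          // empty
        out = buf_[head];                                // read the slot first
        head_.store((head + 1) % buf_.size());           // then free it
        return true;
    }

private:
    std::vector<T> buf_;
    std::atomic<std::size_t> head_{0};  // next slot to read, written by consumer
    std::atomic<std::size_t> tail_{0};  // next slot to write, written by producer
};

// Usage sketch: push 0..n-1 on one thread and sum them on another.
long long transfer_sum(int n) {
    SpscQueue<int> q(64);
    long long sum = 0;
    std::thread producer([&] {
        for (int i = 0; i < n; ++i)
            while (!q.try_push(i)) std::this_thread::yield();  // queue full
    });
    std::thread consumer([&] {
        int v = 0;
        for (int i = 0; i < n; ++i) {
            while (!q.try_pop(v)) std::this_thread::yield();   // queue empty
            sum += v;
        }
    });
    producer.join();
    consumer.join();
    return sum;
}
```

The sketch yields while waiting only to stay friendly to shared machines; with both threads pinned to their own cores, as in the problem setup, a plain busy-spin would be the more likely choice.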

Instruction Execution Order in CPU

Out-of-order execution
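Out-of-order atomic updates and memory accesses lead to incorrect behavior: if the index update becomes visible before the write to the slot, the consumer can read a slot that has not been written yet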

Aspects of std::atomic
1. Atomicity - operations on the atomic object are indivisible
2. Memory Ordering - determines how memory accesses (atomic and non-atomic) surrounding an atomic operation are ordered. This is what synchronizes memory accesses across threads.
Fortunately, the default load and store operations use std::memory_order_seq_cst, which ensures that this reordering does not occur. As we will uncover, the choice of memory ordering has a significant impact on performance.
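
To make "default" concrete: calling load or store without an ordering argument is the same as passing std::memory_order_seq_cst explicitly. The helper name default_vs_explicit below is hypothetical and only illustrates the two spellings.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: each pair of calls below requests the same ordering.
int default_vs_explicit() {
    std::atomic<int> a{0};
    a.store(1);                                   // default ordering
    a.store(2, std::memory_order_seq_cst);        // same ordering, spelled out
    const int x = a.load();                       // default ordering
    const int y = a.load(std::memory_order_seq_cst);
    assert(x == y);                               // both loads observe 2
    return x + y;
}
```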

Memory ordering in atomic

std::memory_order_relaxed
■No ordering constraints for reads or writes around the atomic variable
■Only ensures atomicity
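
A small sketch of "only ensures atomicity": two threads increment a shared counter with relaxed operations. No increment is lost, but nothing is promised about how other memory accesses are ordered around them. The function name relaxed_count is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo: relaxed increments are still indivisible.
int relaxed_count(int per_thread) {
    assert(per_thread >= 0);
    std::atomic<int> counter{0};
    auto work = [&] {
        for (int i = 0; i < per_thread; ++i)
            counter.fetch_add(1, std::memory_order_relaxed);
    };
    std::thread t1(work);
    std::thread t2(work);
    t1.join();   // join() synchronizes with the end of each thread,
    t2.join();   // so the final load below sees every increment
    return counter.load(std::memory_order_relaxed);
}
```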

Optimization: Using relaxed ordering

std::memory_order_acquire / release
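
This slide's body is not in the transcript. As a hedged illustration of the acquire/release pairing, the sketch below publishes a plain int through an atomic flag: the release store keeps the earlier write from moving after it, and the acquire load keeps the later read from moving before it. The names message_pass, payload and ready are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo of release/acquire message passing.
int message_pass() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread writer([&] {
        payload = 42;                                    // (1) write the data
        ready.store(true, std::memory_order_release);    // (2) publish; (1) stays before (2)
    });
    std::thread reader([&] {
        while (!ready.load(std::memory_order_acquire))   // (3) wait for the flag
            std::this_thread::yield();
        seen = payload;                                  // (4) stays after (3), sees (1)
    });

    writer.join();
    reader.join();
    assert(seen == 42);
    return seen;
}
```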

Optimization: Using memory ordering
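
The optimized code from these slides is not in the transcript. One common way to apply relaxed and acquire/release ordering to an SPSC ring buffer is sketched below, under these assumptions: each thread reads its own index with relaxed ordering (nobody else writes it), reads the other thread's index with acquire, and publishes its own index with release. The names SpscQueueAcqRel, try_push, try_pop and transfer_in_order are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical SPSC queue with explicit memory orderings.
template <typename T>
class SpscQueueAcqRel {
public:
    explicit SpscQueueAcqRel(std::size_t capacity) : buf_(capacity + 1) {
        assert(capacity > 0);
    }

    // Producer thread only.
    bool try_push(const T& value) {
        // Only the producer writes tail_, so reading it back needs no ordering.
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % buf_.size();
        // Acquire pairs with the consumer's release store to head_: the
        // consumer is done with any slot the producer is about to reuse.
        if (next == head_.load(std::memory_order_acquire)) return false;
        buf_[tail] = value;
        // Release: the slot write above is visible before the new tail is.
        tail_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer thread only.
    bool try_pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        // Acquire pairs with the producer's release store to tail_.
        if (head == tail_.load(std::memory_order_acquire)) return false;
        out = buf_[head];
        // Release: the slot read above completes before the slot is freed.
        head_.store((head + 1) % buf_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<T> buf_;
    std::atomic<std::size_t> head_{0};
    std::atomic<std::size_t> tail_{0};
};

// Usage sketch: check that n values arrive complete and in order.
bool transfer_in_order(int n) {
    SpscQueueAcqRel<int> q(16);
    bool ok = true;
    std::thread producer([&] {
        for (int i = 0; i < n; ++i)
            while (!q.try_push(i)) std::this_thread::yield();
    });
    std::thread consumer([&] {
        int v = -1;
        for (int i = 0; i < n; ++i) {
            while (!q.try_pop(v)) std::this_thread::yield();
            if (v != i) ok = false;
        }
    });
    producer.join();
    consumer.join();
    return ok;
}
```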

False Sharing
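■False sharing occurs when two or more threads access different variables that are located on the same cache line and at least one of them writes. Each write invalidates the line in the other core's cache, so the threads slow each other down even though they share no data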
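
A minimal sketch of the usual fix, assuming 64-byte cache lines (common on x86-64, but an assumption here): align each index to its own cache line with alignas. The names kCacheLine, PackedIndices, PaddedIndices and index_distance are hypothetical. C++17 also defines std::hardware_destructive_interference_size in <new> for this purpose, though not every standard library provides it.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size; check the target CPU before relying on it.
constexpr std::size_t kCacheLine = 64;

// Both indices are adjacent, so they very likely share one cache line.
struct PackedIndices {
    std::atomic<std::size_t> head{0};
    std::atomic<std::size_t> tail{0};
};

// Each index starts on its own cache line, so a producer writing tail
// does not invalidate the line that holds head.
struct PaddedIndices {
    alignas(kCacheLine) std::atomic<std::size_t> head{0};
    alignas(kCacheLine) std::atomic<std::size_t> tail{0};
};

static_assert(alignof(PaddedIndices) == kCacheLine, "indices are cache-line aligned");
static_assert(sizeof(PaddedIndices) == 2 * kCacheLine, "one cache line per index");

// Byte distance between the two indices of a given layout.
template <typename Indices>
std::size_t index_distance() {
    Indices idx;
    const auto h = reinterpret_cast<std::uintptr_t>(&idx.head);
    const auto t = reinterpret_cast<std::uintptr_t>(&idx.tail);
    assert(t > h);
    return static_cast<std::size_t>(t - h);
}
```

With this layout, index_distance<PaddedIndices>() is a full cache line, while index_distance<PackedIndices>() is only the size of one index.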

Results

Check out my slides and blog for the complete code

Resources
■Fedor Pikus’ introductory talk on std::atomic and memory ordering
■Source code of the Boost.Lockfree spsc_queue
■Herb Sutter’s “atomic<> Weapons” talk

Thank you! Let’s connect.
Sarthak Sehgal
linkedin/sarthaksehgal99
sartech.substack.com
