Scheduler Tracing With ftrace + eBPF by Jason Rahman

ScyllaDB · Oct 15, 2024

About This Presentation

Dive into understanding app latency by exploring the Linux scheduler with ftrace, eBPF, and Perfetto for visualization. Uncover quirks in the CFS scheduler and get a glimpse of the new EEVDF scheduler in recent kernels. #Linux #eBPF #Performance #DevOps


Slide Content

A ScyllaDB Community
Linux Scheduler Tracing With Ftrace
Jason Rahman
Principal Software Engineer at Microsoft

Jason Rahman

Principal Software Engineer at Microsoft
■Working On Azure PostgreSQL These Days
■Previously @ Meta Working On MySQL & Distributed File Systems
■Enjoy Learning About The Intricacies Of Operating Systems + Hardware
■In My Free Time: Adjunct Instructor + Landscape Photography (IG: @jrahman.photography)

Linux Scheduler Tracing

Linux ftrace subsystem
Linux has a tracing subsystem built into the kernel itself.
■Tracepoints defined across various kernel subsystems
●This includes events like sched/sched_switch
●Can reconstruct full scheduling timeline
■Configured via tracefs
●Enable events with `echo 1 > /sys/kernel/tracing/events/${EVENT}/enable`
●Read trace buffers with `cp /sys/kernel/tracing/trace ./trace`

Lockfree, low overhead per-CPU ring buffers
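
As a small illustration of what this looks like in practice (a sketch only: exact event names and column layout vary between kernel versions):

# List the scheduler tracepoints this kernel exposes
grep '^sched:' /sys/kernel/tracing/available_events

# After enabling sched/sched_switch, records in the trace buffer look roughly like this
# (illustrative values; the flag columns differ across kernel versions):
#   worker-4242 [002] d..2. 1029.338127: sched_switch: prev_comm=worker prev_pid=4242
#       prev_prio=120 prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120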

Perfetto Trace Viewer
Data Is Helpful, Visualizations Are Better
■ui.perfetto.dev Has Native Ftrace Support

Service Threading Architecture

Asynchronous IO Application Architecture

Respond Back To Source Thread(s)

Key Workload Behaviors
Each group of threads has specific behaviors and performance requirements:
■IO Threads
●Cooperative multi-tasking, avoid long running tasks that block others
●Short bursts of CPU upon IO completion
●Latency sensitive, need to get IO to/from disk/network ASAP
■CPU Threads
●Work-stealing, shared work queue to distribute long running tasks
●Longer running CPU heavy tasks

Worker Thread Wake Up To Process Queue Entry

IO Thread Wake Up To Send Response

Asynchronous IO Application Latency
With at least 2x thread hops per request (likely more in practice, including downstream API calls), scheduler behavior on wakeup has a significant impact on latency (a rough way to measure this wakeup delay from the trace is sketched after the list below).
■CPU Tasks Occupy CPU Core For Long Periods Of Time
■IO Threads Need To Wake Up Quickly To Service IO
■The former often interferes with the latter, as we’ll see
■Need high fidelity to debug scheduling problems that occur
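
One minimal sketch of that kind of high-fidelity measurement: pair each sched_waking event with the sched_switch that actually puts the woken pid on a CPU, and print the gap. This assumes the default ftrace text layout captured by the script on the next slide; field positions can shift between kernel versions, so treat it as illustrative rather than exact.

# Pair sched_waking with the matching sched_switch per pid, then show the
# largest wakeup delays (in seconds).
awk '
    /sched_waking:/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9]+\.[0-9]+:$/) ts = substr($i, 1, length($i) - 1)   # timestamp
            if ($i ~ /^pid=/)             pid = substr($i, 5)                  # woken pid
        }
        waking[pid] = ts
    }
    /sched_switch:/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9]+\.[0-9]+:$/) ts = substr($i, 1, length($i) - 1)
            if ($i ~ /^next_pid=/)        pid = substr($i, 10)                 # pid now running
        }
        if (pid in waking) {
            printf "pid %s wakeup latency: %.6f s\n", pid, ts - waking[pid]
            delete waking[pid]
        }
    }
' ./trace | sort -t: -k2 -n | tail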

Let’s Try It Out!

Sample Ftrace Usage
EVENTS="sched/sched_waking sched/sched_switch sched/sched_wakeup_new"
# Enable each scheduler tracepoint (double quotes so ${EVENT} expands before sudo runs bash)
for EVENT in ${EVENTS}; do
    sudo bash -c "echo 1 > /sys/kernel/tracing/events/${EVENT}/enable"
done
# Clear the trace buffer, record for one second, then copy the buffer out
sudo bash -c 'echo > /sys/kernel/tracing/trace'
sudo bash -c 'echo 1 > /sys/kernel/tracing/tracing_on'
sleep 1
sudo bash -c 'cp /sys/kernel/tracing/trace ./trace'
sudo bash -c 'echo 0 > /sys/kernel/tracing/tracing_on'
for EVENT in ${EVENTS}; do
    sudo bash -c "echo 0 > /sys/kernel/tracing/events/${EVENT}/enable"
done

Load Generator - PWM
// Busy-spin for run_us microseconds, then sleep for the rest of the period,
// producing a square-wave ("PWM") CPU load with a run_us / period_us duty cycle.
// Assumes tick() reads the TSC and ticks_per_us() is its calibration;
// _mm_pause comes from std::arch::x86_64.
let ticks_per_us = ticks_per_us();
loop {
    let start = tick();
    let target = start + run_us * ticks_per_us;
    // Spin until the busy window ends; PAUSE keeps the spin loop cheap on the core.
    while tick() < target {
        unsafe {
            x86_64::_mm_pause();
        }
    }
    // Sleep away the remainder of the period, releasing the CPU.
    std::thread::sleep(Duration::from_micros(period_us - run_us));
}
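
To tie the pieces together, a rough sketch of driving the scenario traced on the following slides: three PWM tasks at roughly 65% duty cycle running while the ftrace capture script from the earlier slide records. The pwm_load binary name and its flags are placeholders standing in for the Rust loop above, not a published tool.

# Hypothetical invocation: three ~65% duty-cycle tasks (650us busy out of a 1000us period)
pids=""
for i in 1 2 3; do
    ./pwm_load --run-us 650 --period-us 1000 &   # placeholder binary and flags
    pids="$pids $!"
done
sleep 5    # run the ftrace capture from the earlier slide during this window
kill $pids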

3 Tasks Consuming 65% Of 2 Cores => Moderate Load

Task Runnable + Core Left Idle => Not Work Conserving!

Per CPU Runqueue

CFS Failing To Load Balance – Overload On Wakeup Bug

A Picture Is Worth 1000 Words, A Few Good Traces Are Worth A Research Paper

Lessons
What We Learned Today
■Ftrace
●Powerful Tool For Detailed Performance Analysis, Accessible Via Tracefs
●Built Into Linux, No Additional Tools Needed
■Perfetto
●Quick + Lightweight Trace Viewer In The Browser
■Performance Analysis
●Linux Scheduler Has Bugs + Poor Behaviors
●CPU Load Is An Averaged Measurement
●Need Traces To Find Fine-Grained Latency Issues

Thank you! Let’s connect.
Jason Rahman
[email protected]
https://jprahman.substack.com