Noisy Neighbor Detection with eBPF by Jose Fernandez

ScyllaDB 1,270 views 22 slides Oct 16, 2024

About This Presentation

Tackling "noisy neighbor" issues in multi-tenant setups! At Netflix, we use eBPF to monitor and mitigate excessive CPU usage in real-time. Learn how we instrument the Linux scheduler, optimize eBPF, and maintain high performance. Get actionable insights for your infrastructure. #DevOps #eB...


Slide Content

A ScyllaDB Community
Noisy Neighbor Detection
with eBPF
Jose Fernandez
Senior Software Engineer

Jose Fernandez (he/him)

Senior Software Engineer at Netflix
■@Netflix: Cloud Infrastructure, Compute team
■Specialize in observability, performance, and efficiency
■I created and maintain bpftop
■Outside of work, I enjoy spending time with family, hiking in the CO Rocky Mountains, playing pickleball, and gaming.

The noisy neighbor problem
■Netflix Context: On Titus, our multi-tenant compute platform, a "noisy neighbor" refers to a container or system service that heavily utilizes server resources, causing performance degradation in adjacent containers.
■CPU utilization is our primary focus due to its frequent role in noisy neighbor issues.
■Leads to degraded user experience and support burden for infrastructure teams.

The blame game

Traditional detection methods
■Limitations of Traditional Tools:
●Tools like perf can introduce significant overhead.
●They risk degrading performance further.

■Reactive Deployment:
●Typically deployed only after issues occur, which is too late for effective investigation.

■Expertise Barrier:
●Debugging requires deep low-level expertise and specialized tools.
●This isn't scalable or efficient for rapid problem-solving.

Requirements for a solution
■Continuous, Real-Time Instrumentation
■Minimal Performance Impact
■Deep Kernel-Level Visibility
■Handles Netflix-scale
■Accessible to Non-Experts

eBPF as the Solution


■Continuous, Real-Time Instrumentation
●eBPF allows us to develop programmable monitoring solutions tailored to our needs.
■Minimal Performance Impact
●Because eBPF programs run inside the kernel and are highly optimized, they introduce minimal overhead.
■Deep Kernel-Level Visibility
●eBPF provides access to low-level scheduling events and kernel data structures.
■Handles Netflix-scale
●eBPF scales efficiently across our multi-tenant environment, handling monitoring at the scale we require.
■Accessible to Non-Experts
●eBPF allows us to emit metrics that power dashboards in an understandable format.

Continuous Instrumentation of the Linux Scheduler

sched_wakeup & sched_wakeup_new
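SEC("tp_btf/sched_wakeup")
int tp_sched_wakeup(u64 *ctx)
{
    /* For this BTF-enabled tracepoint, ctx[0] is the task being woken up. */
    struct task_struct *task = (void *)ctx[0];
    u32 pid = task->pid;
    u64 ts = bpf_ktime_get_ns();

    /* Record when the task became runnable. BPF_NOEXIST keeps an existing entry. */
    bpf_map_update_elem(&runq_enqueued, &pid, &ts, BPF_NOEXIST);
    return 0;
}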

sched_switch
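The sched_wakeup handler stores an enqueue timestamp per PID; a sched_switch handler can then turn that timestamp into a run queue latency when the task finally gets the CPU. The following is a minimal userspace C sketch of that bookkeeping, not the eBPF program from the talk. It assumes a plain array in place of the runq_enqueued map, and the names on_wakeup, on_switch, and MAX_PIDS are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PIDS 4096 /* hypothetical bound, for this sketch only */

/* Stand-in for the runq_enqueued map: enqueue timestamp per PID, 0 = absent. */
static uint64_t runq_enqueued[MAX_PIDS];

/* Models the sched_wakeup handler: record when the task became runnable.
 * Like BPF_NOEXIST, an existing entry is left untouched. */
static void on_wakeup(uint32_t pid, uint64_t now_ns)
{
    assert(pid < MAX_PIDS && now_ns != 0);
    if (runq_enqueued[pid] == 0)
        runq_enqueued[pid] = now_ns;
}

/* Models a sched_switch handler for the task that is about to run.
 * Returns true and writes the latency if an enqueue timestamp was recorded. */
static bool on_switch(uint32_t next_pid, uint64_t now_ns, uint64_t *latency_ns)
{
    assert(next_pid < MAX_PIDS && latency_ns != 0);
    uint64_t ts = runq_enqueued[next_pid];
    if (ts == 0)
        return false;            /* the enqueue event was missed */
    runq_enqueued[next_pid] = 0; /* delete the entry once consumed */
    *latency_ns = now_ns - ts;
    return true;
}
```

In an actual eBPF program the array accesses would be map lookups, updates, and deletes, and the timestamps would come from bpf_ktime_get_ns().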

kfuncs

ringbuffer
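The BPF ring buffer helpers bpf_ringbuf_reserve and bpf_ringbuf_submit follow a reserve-then-submit pattern, and a reserve that finds no free space returns NULL so the event is dropped instead of blocking the scheduler path. The sketch below models that pattern in userspace C so it can be compiled and run on its own. It is not the kernel implementation, and struct runq_event, RING_SLOTS, and the ring_* functions are hypothetical names; the slides do not show the real event layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 4 /* hypothetical capacity; must be a power of two */

/* Hypothetical event layout, for illustration only. */
struct runq_event {
    uint32_t pid;
    uint64_t runq_lat_ns;
};

struct ring {
    struct runq_event slots[RING_SLOTS];
    uint64_t head;    /* next slot the producer writes */
    uint64_t tail;    /* next slot the consumer reads */
    uint64_t dropped; /* events lost because the ring was full */
};

/* Producer side: returns a slot to fill, or NULL when the ring is full.
 * A NULL result means the event is dropped rather than waited for. */
static struct runq_event *ring_reserve(struct ring *r)
{
    if (r->head - r->tail == RING_SLOTS) {
        r->dropped++;
        return NULL;
    }
    return &r->slots[r->head & (RING_SLOTS - 1)];
}

/* Producer side: publish the slot returned by the last ring_reserve(). */
static void ring_submit(struct ring *r)
{
    r->head++;
}

/* Consumer side: copy out the oldest event; returns 0 if the ring is empty. */
static int ring_consume(struct ring *r, struct runq_event *out)
{
    assert(out != NULL);
    if (r->tail == r->head)
        return 0;
    *out = r->slots[r->tail & (RING_SLOTS - 1)];
    r->tail++;
    return 1;
}
```

Counting drops, as this sketch does, is one way to notice when the consumer cannot keep up with the event rate.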

eBPF rate limiter
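Scheduler events fire at a very high rate, so limiting how often an event is emitted keeps the cost of shipping data to userspace bounded. The following userspace C sketch shows one simple way to do this: allow at most one event per key per time window. It is an assumption about the general shape of such a limiter, not the implementation from the talk; RATE_LIMIT_NS, MAX_CGROUPS, the cgroup_slot key, and should_emit are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CGROUPS   64         /* hypothetical table size */
#define RATE_LIMIT_NS 1000000ULL /* hypothetical window: one event per millisecond */

/* Last time an event was emitted for each cgroup slot; 0 = never. */
static uint64_t last_emit_ns[MAX_CGROUPS];

/* Returns true if an event for this slot may be emitted at now_ns,
 * and records the emission time when it does. */
static bool should_emit(uint32_t cgroup_slot, uint64_t now_ns)
{
    assert(cgroup_slot < MAX_CGROUPS);
    uint64_t last = last_emit_ns[cgroup_slot];
    if (last != 0 && now_ns - last < RATE_LIMIT_NS)
        return false; /* still inside the window: drop this event */
    last_emit_ns[cgroup_slot] = now_ns;
    return true;
}
```

Checking the limit before doing any other work keeps the rejected path as cheap as possible.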

Continuous Instrumentation of the Linux Scheduler

container1’s 99th percentile runq.latency averages 83µs (microseconds), with spikes up to 400µs.

Launching container2 at 10:35, which maxes out all CPUs on the host, caused a 131-millisecond spike in container1’s P99 run queue latency.
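For readers unfamiliar with the P99 figures quoted above, the sketch below computes a percentile from a set of latency samples using the nearest-rank definition. This is one common definition, chosen here as an assumption; the metrics pipeline behind the graphs may use a different estimator. The function name percentile_ns is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Ascending comparison for qsort() over uint64_t values. */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Nearest-rank percentile: sorts the samples in place and returns the
 * smallest value with at least pct percent of the samples at or below it. */
static uint64_t percentile_ns(uint64_t *samples, size_t n, unsigned pct)
{
    assert(samples != NULL && n > 0 && pct >= 1 && pct <= 100);
    qsort(samples, n, sizeof(samples[0]), cmp_u64);
    size_t rank = (n * pct + 99) / 100; /* ceil(n * pct / 100), 1-based */
    return samples[rank - 1];
}
```

Because P99 tracks the slowest 1% of samples, it surfaces latency spikes that an average would hide.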

Improving eBPF stats calculation


■Identified improvement opportunity in eBPF stats calculation
■Patch included in Linux kernel 6.10 release
■Stops some instrumentation overhead from being captured in the stats
■https://tinyurl.com/ebpf-stats-fix

Optimizing eBPF code


■BPF_MAP_TYPE_HASH: Most performant for enqueued timestamps.
■BPF_MAP_TYPE_TASK_STORAGE: Nearly twofold performance decline.
■BPF_MAP_TYPE_PERCPU_HASH: Slightly less performant; reason unclear.
■BPF_CORE_READ Helper: Adds 20-30 ns. Direct access for BTF-enabled tracepoints recommended.
■BPF_MAP_TYPE_LRU_HASH: 40-50 ns slower. Adjusted size to reduce space concerns.
■Kernel Tasks (PID 0): Avoided costly operations with early exits and conditional logic.

Conclusion
●eBPF proved invaluable for scheduler instrumentation and monitoring.

●Solution enabled self-serve noisy neighbor detection.

●Recognized the importance of tools like bpftop for optimizing eBPF code.

●Expect more infrastructure observability and business logic to shift to eBPF.
○sched_ext: Potential to revolutionize scheduling decisions tailored to workload needs.

Thank you! Let’s connect.
Jose Fernandez
josef@netflix.com
@jrfernandez
jrfernandez.com
tinyurl.com/noisy-neighbor-detection