Zero-overhead Container Networking with eBPF and Netkit by Liz Rice

ScyllaDB, 24 slides, Oct 16, 2024

About This Presentation

Introducing Netkit: a new eBPF enhancement replacing veth connections in container networking. Say goodbye to the overhead slowing down container apps. With Netkit, container networking now matches the speed of host networking. Fast, efficient, and ready to deploy. #DevOps #eBPF


Slide Content

A ScyllaDB Community
Zero-overhead Container Networking
with eBPF and Netkit
Liz Rice
Chief Open Source Officer, Isovalent @ Cisco

Hi, I’m Liz (she/her)
Chief Open Source Officer, Isovalent @ Cisco
■ Previously chair of CNCF’s Technical Oversight Committee
■ Early career: writing networking code
■ Containers / security / eBPF / cloud native
■ Often found on a bike or playing music

Container networking overhead

Traditional host networking
Host
app
eth0
network stack
(IP, netfilter,
routing, …)
Network interface card
socket

Container networking
Host
Container app
eth0
network stack
(IP, netfilter,
routing, …)
network interface card
veth
veth
virtual
Ethernet
connection
Host’s network
namespace
Container’s network
namespace
network
stack
socket
Traffic passes through the network stack twice
TCP back-pressure breakages

Use eBPF programs to do host routing
New eBPF helper functions in kernel 5.10

Host routing with eBPF - ingress
Host
Container app
eth0
network stack
(IP, netfilter,
routing, …)
veth
veth
Host’s network
namespace
Container’s network
namespace
network
stack
socket
eBPF
programs
Ingress: bpf_redirect_peer()
Switch packet’s netns; host network stack bypassed

Host routing with eBPF - egress
Host
Container app
eth0
veth
veth
Host’s network
namespace
Container’s network
namespace
network
stack
socket
eBPF
programs
Egress: bpf_redirect_neigh()
FIB lookup + fill in L2 info
Host network stack (IP, netfilter, routing, …): bypassed
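
To make "FIB lookup + fill in L2 info" concrete, here is a minimal user-space sketch in Python. It is a toy model, not kernel code and not how bpf_redirect_neigh() is implemented: the routing table, neighbor table, interface names, addresses, and the function names fib_lookup and redirect_neigh are all hypothetical, chosen only for illustration. It shows the two steps the slide names: a longest-prefix match to pick the next hop and egress device, then a neighbor-table lookup to obtain the destination MAC address.

```python
import ipaddress

# Hypothetical FIB: prefix -> (next hop, egress interface).
# A next hop of None means the destination is directly reachable (on-link).
FIB = {
    ipaddress.ip_network("0.0.0.0/0"): ("192.0.2.1", "eth0"),
    ipaddress.ip_network("192.0.2.0/24"): (None, "eth0"),
}

# Hypothetical neighbor table: IP address -> MAC address.
NEIGH = {
    "192.0.2.1": "02:00:00:00:00:01",
    "192.0.2.50": "02:00:00:00:00:50",
}


def fib_lookup(dst):
    """Longest-prefix match: return (next_hop, ifname), or None if no route."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in FIB if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    next_hop, ifname = FIB[best]
    # For on-link routes the next hop is the destination itself.
    return (next_hop or dst, ifname)


def redirect_neigh(dst):
    """Toy egress path: route lookup, then fill in the L2 destination."""
    route = fib_lookup(dst)
    if route is None:
        return None
    next_hop, ifname = route
    mac = NEIGH.get(next_hop)
    if mac is None:
        # This toy model simply gives up when the neighbor is unknown.
        return None
    return {"ifname": ifname, "next_hop": next_hop, "dst_mac": mac}


if __name__ == "__main__":
    print(redirect_neigh("198.51.100.7"))  # off-link: goes via the default gateway
    print(redirect_neigh("192.0.2.50"))    # on-link: neighbor is the destination
```

The point of the sketch is that everything needed to emit the frame (egress device and L2 addresses) comes from two table lookups, which is why the rest of the host network stack can be skipped on egress.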

Host routing with eBPF
Host
Container app
eth0
veth
veth
Host’s network
namespace
Container’s network
namespace
network
stack
socket
eBPF
programs
Host network stack (IP, netfilter, routing, …): bypassed
eBPF
programs

So far so good, but still some overhead

veth backlog queue

Deferral to
ksoftirqd

Host routing with eBPF
Host
Container app
eth0
veth
veth
Host’s network
namespace
Container’s network
namespace
network
stack
socket
eBPF
programs
Host network stack (IP, netfilter, routing, …): bypassed
eBPF
programs
Egress packets can go on per-CPU backlog queue
Ingress packets skip per-CPU backlog queue

tcx - TC express
TC = traffic control subsystem
in kernel network stack

eBPF program attachment in TC
Before kernel 6.6

eBPF program attachment in TC
Before kernel 6.6
New style: tcx
59 cycles -> 33 cycles to BPF program entry
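
As a quick worked calculation, the two cycle counts on this slide can be turned into a relative saving. The Python snippet below uses only the figures from the slide (59 and 33 cycles); the variable names are illustrative.

```python
# Cycle counts to BPF program entry, as quoted on the slide.
cycles_before = 59  # TC attachment before kernel 6.6
cycles_tcx = 33     # tcx attachment

saved = cycles_before - cycles_tcx
reduction_pct = saved / cycles_before * 100
speedup = cycles_before / cycles_tcx

print(f"saved {saved} cycles ({reduction_pct:.1f}% fewer), "
      f"{speedup:.2f}x faster to program entry")
# prints: saved 26 cycles (44.1% fewer), 1.79x faster to program entry
```

This measures only the cost of reaching the BPF program, not the cost of running it, so the saving applies per packet on top of whatever the program itself does.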

Replace veth devices
with netkit devices
Minimal eBPF-programmable devices

Netkit devices
Host
Container
netkit
primary
netkit
peer
Host’s network
namespace
Container’s network
namespace
eBPF
programs
eBPF
programs
Only primary can manage eBPF
programs

Netkit devices
Host
Container app
eth0
netkit
netkit
Host’s network
namespace
Container’s network
namespace
network
stack
socket
eBPF
programs
eBPF
programs
Fast netns
switching and no
queuing for both
ingress and
egress
Host network stack (IP, netfilter, routing, …): bypassed
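
The three datapaths covered so far can be summarized in a small Python table. This is only a bookkeeping sketch of what the slides state, with hypothetical class and field names. One assumption goes beyond the slides: plain veth without eBPF host routing is modeled as using the per-CPU backlog queue in both directions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datapath:
    name: str
    host_stack_bypassed: bool  # is the host network stack skipped?
    ingress_backlog: bool      # can ingress packets hit the per-CPU backlog queue?
    egress_backlog: bool       # can egress packets hit the per-CPU backlog queue?

    def stack_traversals(self) -> int:
        # The container's own stack is always traversed; the host's only if not bypassed.
        return 1 if self.host_stack_bypassed else 2

    def queue_free(self) -> bool:
        return not (self.ingress_backlog or self.egress_backlog)


DATAPATHS = [
    # Assumption: plain veth queues in both directions (not stated on the slides).
    Datapath("veth, routing in host stack", False, True, True),
    # From the slides: ingress skips the backlog queue, egress can still use it.
    Datapath("veth + eBPF host routing", True, False, True),
    # From the slides: no queuing for both ingress and egress.
    Datapath("netkit + eBPF host routing", True, False, False),
]

if __name__ == "__main__":
    for dp in DATAPATHS:
        print(f"{dp.name}: {dp.stack_traversals()} stack traversal(s), "
              f"queue-free={dp.queue_free()}")
```

Read top to bottom, the table tracks the talk's progression: eBPF host routing removes the second stack traversal, and netkit removes the remaining egress queuing.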

veth vs netkit: backlog queue

Pod with veth: Deferral to ksoftirqd
Pod with netkit: Remains in process context all the way, leading to better process scheduler decisions.

Container networking overhead eliminated!
Throughput as
high as host
baseline

Container networking overhead eliminated!
Latency as
low as host
baseline

Netkit support now in Cilium

Thank you!
Liz Rice
@lizrice
isovalent.com/blog