Optimizing Scheduler for Linux Gaming.pdf

igalia 230 views 30 slides Apr 29, 2024
Slide 1
Slide 1 of 30
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30

About This Presentation

This talk dives into how the scheduler impacts your gameplay on Linux and
unveils our journey to smoother gameplay. How does task scheduling impact Linux
gaming? Suboptimal task scheduling can cause stuttering while playing games on
the Steam Deck game console. First, we nail down the enemy. What ex...


Slide Content

Optimizing Scheduler for Linux Gaming

Changwoo Min
[email protected]
April 17, 2024

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Talk outline
●Linux gaming is here to stay!
○Does scheduler matter in gaming experience?

●Performance metrics for optimization
○How to measure “stuttering” and what metric to optimize for?

●Gaming workload characterization
○Are there any peculiar properties in gaming that a scheduler can leverage?

●sched_ext: BPF-based extensible scheduler framework
○Easy to experiment new scheduler without pain

●Latency-criticality Aware Virtual Deadline (LAVD) scheduler
○Scheduler should consider the latency criticality of a task
2

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Linux gaming is here to stay
3
Image source: https://store.steampowered.com/steamdeck/

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Linux gaming is here to stay
SteamDeck runs x86 Windows games!

●AMD APU
○AMD Van Gogh: x86 8-core CPU + AMD GPU

●SteamOS
○Arch-Linux based Linux distribution
○with optimizations for SteamDeck hardware (especially AMDGPU)

●Proton
○Compatibility layer for Windows games on Linux
○Includes patched version of Wine
○Plus a lot more: DXVK, VKD3D, gstreamer, ffmpeg
4

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

My game is stuttering
even after upgrading my PC :-(
5
Image source:
https://www.reddit.com/r/AM
DHelp/comments/172sjtj/gam
es_keep_on_stuttering_i_just_
upgraded_my_pc/
https://steamcommunity.com/
discussions/forum/1/5824898
961055124750/

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Does scheduler matter
in gaming experience?
Yes, it does.
While most games are GPU-heavy, CPU is still the controller.

6
Scheduler A
Avg FPS: 41.3
Low 1% FPS: 29.5
Scheduler B
Avg FPS: 34.2
Low 1% FPS: 4.6
Time
Frame
Per
Second

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

What metrics to optimize for?
7
●When we say “performance”, what exactly does it mean?

●Performance means many different things
○Throughput: how much job can be done in a given time
○Latency: time to process a task
○Tail latency (99p, 99.9p): latency of the worst 1% or 0.1%

●The optimization strategy could vary depending on the optimization
metrics
○E.g., throughput vs. latency vs. tail latency optimized design

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

How to quantify “stuttering”?
8
●In games, FPS (frame per second) or frame time are the widely used
metric
○Average FPS ≈ throughput of FPS
○Low 1% FPS ≈ 99p latency of FPS
○Low 0.1% FPS ≈ 99.9p latency of FPS

●Not only average FPS but also low 1% FPS are critical in user
experience. A sudden FPS dip is also known as a stuttering problem.

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Talk outline
●Linux gaming is here to stay!
○Does scheduler matter in gaming experience?

●Performance metrics for optimization
○How to measure “stuttering” and what metric to optimize for?

●Gaming workload characterization
○Are there any peculiar properties in gaming that a scheduler can leverage?

●sched_ext: BPF-based extensible scheduler framework
○Easy to experiment new scheduler without pain

●Latency-criticality Aware Virtual Deadline (LAVD) scheduler
○Scheduler should consider the latency criticality of a task
9

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Understanding game workloads
●It is critical to understand the characteristics of games to optimize
scheduling policy for games.

●If games have the common characteristics, which are differentiated
from other general program domains (e.g., HPC, AI/ML, database), there
are chance to come up with a game optimized scheduling policy.

●We collect and analyze all scheduling activities while playing games.

10

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Analyzed a lot of scheduling traces
11

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Task scheduling and CPU utilization
12
●Around 300 tasks are scheduled while running a game.
○Around 90% are long-living tasks; only 10% of tasks are terminated.

●Top 30-40 most frequently scheduled tasks take 95% of scheduling.
○Around half of them are system tasks -- especially, wine,
graphics, and audio servers, taking 30--40% of scheduling.
○There are 15-20 game-specific tasks, which takes 60-70%
scheduling.

●CPU utilization is moderately high -- around 65-95%, but not
overloaded (i.e., no 100%).

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Task execution time per schedule
13
●In general, tasks run for very short duration – roughly a few 100s usec
on average to a few msec maximum.

●Task execution time is very stable and is predictable using its average.

Distribution of task’s runtime
average per schedule in a game
tasksruntime avg (msec)
○Some coordination tasks run very
shortly (e.g., wineserver:260 usec) but
some work tasks (e.g., a task worker:
1.65 msec) run longer than average. < 1 msec

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

What makes scheduling happen?
14
●Preemption (e.g., timer interrupt) takes only 25-30% of scheduling.

●70-75% of scheduling is initiated by waiting system calls – such as
epoll, pipe_read, futex_wait , etc.

●In this case, a task is first scheduled out waiting for an event then it is
scheduled in when the event is notified.

Call stacks which triggered
scheduling in a game

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Task waiting time
15
●Wait time from one schedule to another schedule is a task’s behavioral
property. Each task’s wait time is pretty constant.
Wait times over time in a game

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Tasks are tightly linked in a graph
16
●Task graphs of top 50 percentile of waiter-wakers

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Summary of game workloads characteristics
17
●Majority of game tasks shows periodic behavior
○given the fact that a task has a stable execution time and a stable
wait time.

●A sequence of tasks serves for a single job (e.g., from a user input to
display update).

●Therefore, the accumulated scheduling delay in a task chain will affect
end-user’s experience; it would eventually result in a sudden spike of
frame time (low 1% FPS).

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Talk outline
●Linux gaming is here to stay!
○Does scheduler matter in gaming experience?

●Performance metrics for optimization
○How to measure “stuttering” and what metric to optimize for?

●Gaming workload characterization
○Are there any peculiar properties in gaming that a scheduler can leverage?

●sched_ext: BPF-based extensible scheduler framework
○Easy to experiment new scheduler without pain

●Latency-criticality Aware Virtual Deadline (LAVD) scheduler
○Scheduler should consider the latency criticality of a task
18

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

sched_ext: BPF-based
extensible scheduler framework
19
●sched_ext enables scheduling policies to be implemented in a BPF program

●The development procedure is somewhat similar to kernel module
programming but BPF guarantees safety (e.g., no memory bugs)
○Write a scheduler policy in BPF
○Compile it
○Load it onto the system, letting BPF and sched_ext infrastructure do all
of the heavy lifting to enable it

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

sched_ext: BPF-based
extensible scheduler framework
20
●Rapid experimentation is possible
○No reboot required, yay!
○BPF cannot crash the host machine!

●GPLv2 only

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Implementing scheduling policies
with sched_ext
21
●BPF program must implement a set of callbacks

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Latency-criticality Aware Virtual Deadline
(LAVD) scheduler
22
●A new scheduling algorithm motivated by gaming workloads on Linux
○Implemented as a BPF scheduler on top of the sched_ext framework

●Proportional share scheduler
○Respect nice value to pursue the fair use of CPU time

●(Virtual) deadline based algorithm
○Task’s deadline to represent the urgency of a task scheduling
○But it is not a real-time scheduling algorithm

●Latency-criticality is the first class factor in making scheduling decisions
○Infer a latency criticality of a task from task’s behavior in a task graph

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Overall procedure of LAVD
23
●LAVD is a deadline-based scheduling algorithm, so its overall procedure is
similar to other deadline-based scheduling algorithms.

●Under LAVD, a runnable task has its time slice and virtual deadline,
which are decided by the scheduler.

●The LAVD scheduler picks a task with the earliest virtual deadline and
allows it to execute for the given time slice.

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Defining latency criticality of a task
24
●We leverage a task's communication and behavioral properties to quantify
its latency criticality.







●Suppose there are three tasks, Task A, B, and C, coordinated by any waiting
calls (e.g., epoll, futex, etc).
○ [Task A] --> [Task B] --> [Task C]

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Defining latency criticality of a task
25
● [Task A] --> [Task B] --> [Task C]

●We define Task B is more latency-critical in the following cases:
○as Task B's runtime per schedule is shorter (runtime B)
■If Task B's runtime per schedule is long, a relatively short scheduling
delay won't affect a lot.
○as Task B wakes Task C more frequently (wake_freq B)
■If Task B frequently wakes up Task C, the scheduling delay of Task B
also delays the execution of Task C.
○as Task B waits for Task A more frequently (wait_freq B)
■If Task B often waits for Task A, the scheduling delay of Task B
delays the completion of executing the task graph.

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Task’s (virtual) deadline
26
●The latency criticality of a task is used to determine task's virtual deadline.

●A more latency-critical task will have a tighter (shorter) deadline, so the
scheduler picks such a task more urgently among other runnable tasks.

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Task’s time slice
27
●The LAVD scheduler tries to schedule all runnable tasks at least once
within a predefined time window, which is called a targeted latency (e.g.,
15 msec).

●The scheduler proportionally divides the targeted latency as per task’s nice
priority (i.e., weight).

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Enforcing the fair of use CPU time
28
●Assigning a task's time slice per its priority does not guarantee the fair use
of CPU time.
○That is because a task can be more frequently executed than other
tasks or yield CPU before entirely consuming its assigned time slice.

●Deferring task’s execution
○LAVD defers picking over-scheduled tasks to reduce the frequency of
task execution.
○The deferring duration is proportional to task’s over-scheduled time.

Optimizing Scheduler for Linux Gaming
Changwoo Min, April 17, 2024

Early result is promising
29
●In many cases, LAVD provides better or similar performance than EEVDF in
terms of average FPS and Low 1% FPS
LAVD
EEVDF
6.9-rc1
upstream kernel

30
Join us!