Non preemptive
A process will eventually misbehave...
Round Robin (Time sharing)
(source: http://www.qnx.com/developers/docs/qnx_4.25_docs/qnx4/sysarch/microkernel.html)
but.. tasks are different in nature!
●Some tasks require CPU time to complete
●others require I/O (waiting for user input,
network data, disk)
●Some just sleep most of the time
Multilevel feedback queue
And many more..
So, how do you interrupt a running
process in the middle of its execution,
and then resume the work later on?
Use a timer interrupt!
●Ask the processor to interrupt the active process and wake up
the kernel every X interval (in the goold old Linux days, this was
defined as CONFIG_HZ):
The “context” in this program
bits 16 ; 16 bits real mode
start:
cli ; disable interrupts
mov si, msg ; SI points to RAM address of msg
mov ah, 0x0e ; print char service, with int 0x10
.loop lodsb ; load RAM: AL <- [DS:SI] && SI++
or al, al ; end of string?
jz halt
int 0x10 ; call BIOS print char
jmp .loop ; next char
halt: hlt ; halt
msg: db "Hello, World!", 0
# void swtch(struct context **old, struct context *new);
#
# Save current register context in old
# and then load register context from new.
# Save old callee-save registers
pushl %ebp ; Save old process ebp (base pointer) register
pushl %ebx ; Save old process ebx (general) register
pushl %esi ; Save old process esi (source index) register
pushl %edi ; Save old process edi (destination index) register
# Switch stacks
movl %esp, (%eax)
movl %edx, %esp
# Load new callee-save registers
popl %edi ; Load new process edi
popl %esi ; Load new process esi
popl %ebx ; Load new process ebx
popl %ebp ; Load new process ebp
ret
Finding out the context switch rate
[us-east-1c] i-91xxxxxx [root:~]$ vmstat -w 10
procs ---------------memory-------------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 3275668 48308 370264 0 0 0 0 12504 8884 1 5 94 0 0
0 0 0 3275492 48308 370264 0 0 0 0 12933 8834 2 5 93 0 0
1 0 0 3275588 48308 370264 0 0 0 0 12490 8825 2 4 94 0 0
Using /usr/bin/time
[us-east-1c] i-xxxxxxxx [root:~]$ /usr/bin/time -v find / > /dev/null
Command being timed: "find /"
User time (seconds): 0.12
System time (seconds): 0.30
Percent of CPU this job got: 6%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.79
...
Voluntary context switches: 7362
Involuntary context switches: 60
...
File system inputs: 58888
Voluntary vs Involuntary
●Non voluntary context switches occur when the
scheduler interrupts the process (i.e. timeslice expired)
●Voluntary context switches occur when the process
yields control to the CPU (i.e. waiting for I/O, sleeping)
Using pidstat
[us-east-1c] i-xxxxxxxx [root:~]$ pidstat -w -p 1504
Linux 3.13.0-55-generic (ip-10-xx-xx-x8) 06/21/2015 _x86_64_ (2 CPU)
●Kernel tries very hard to schedule threads on
the same Core/CPU/Numa node, but
sometimes it can't.
Without CPU affinity
(source: http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html)
With CPU affinity
(source: http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html)
Without CPU affinity With CPU affinity
(source: http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html)
Full context switch
●Save current process context
●TLB flush
●Potentially screw up L1,L2,L3 CPU cache
●Load next process context
●Can be done in hardware - but is usually
done in software.
So how much is too much?
10k c.s. @ 5 micro-seconds per c.s. = 50ms
CPU time spent on context switching.
If this was 100k...
Some formulas
●IO bound tasks - threads = number of cores * (1 + wait time / service time)
●Balanced - N = U * Ncores * (1+ I/O Wait time / CPU time) ^^ same as above without the U factor
●CPU bound tasks - threads = number of CPUs + 1 ^^ same as above but W == 0
see y i like this version of the formula? yes.
I think this is a pretty good presentation.. :P
yep, but it looks like we’ll keep adding slides until tomorrow night…. lol
Most probably you’re right - but this covers great things
we have to stop somewhere. also, we need to get drunk ;)
Conclusions
●Scheduling is complicated
●Context switches can be very expensive
●Find the right balance when deciding how many workers
to use
●Use CPU affinity to reduce cost where applicable
●And most important lesson, don't overload the scheduler
with runnable threads/processes
Further reading
●The Timer Interrupt Handler
●Context Switching on X86
●CPU Scheduling
●How long does it take to make a context switch?
●Calculate the Optimum Number of Threads