vmlinux: Anatomy of bzImage and how an x86_64 processor is booted

AdrianHuang 1,915 views 73 slides Jun 14, 2021

About This Presentation

This slide deck describes the Linux booting flow for x86_64 processors.

Note: When you view the slide deck in a web browser, the screenshots may be blurred. You can download the deck and view it offline, where the screenshots are clear.


Slide Content

vmlinux: Anatomy of bzImage and how an x86_64 processor is booted
Adrian Huang | May 2021
* Based on kernel 5.11 (x86_64) – QEMU
* Legacy BIOS

Agenda
•bzImage: high-level overview
•Layout of bzImage
•ELF layout
•setup.bin and compressed vmlinux
•Physical memory layout
•Entry point of Linux – ‘start_of_setup’ @ 0x10200 (physical memory)
•From viewpoint of GRUB and QEMU loader
•Initialization flow
•Compressed vmlinux
•ELF layout
•Physical memory layout
•Initialization flow

Agenda
•Layout of bzImage
•ELF layout
•setup.bin and compressed vmlinux
•Physical memory layout
•Entry point of Linux – ‘start_of_setup’ @ 0x10200 (physical memory)
•From viewpoint of GRUB and QEMU loader
•Initialization flow
•Compressed vmlinux
•ELF layout
•Physical memory layout
•Initialization flow

Requisite Knowledge
•CPU architecture knowledge
✓Near call and far call
✓Near jump and far jump
✓Instruction opcode
•CPU operation mode
✓Real mode, protected mode and long mode (64-bit mode)
➢Memory addressing
•ELF
✓Relocation, program header, …
•GNU assembly

bzImage: High-level Overview (1/2)
Boot code
(Real mode -> protected mode)
Compressed vmlinux
Boot code
(protected mode + paging ->long mode)
vmlinux.bin.gz
bzImage

bzImage: High-level Overview (2/2)
Boot code
(Real mode -> protected mode)
Compressed vmlinux
Boot code
(protected mode + paging ->long mode)
vmlinux.bin.gz
bzImage
setup.bin
Compressed
vmlinux
(Protected-mode kernel)
CRC
bzImage

Layout of bzImage – setup.bin
setup.bin
Compressed
vmlinux
(Protected-mode kernel)
.bstext
CRC
.bsdata
Part 1 of ‘.header’
Part 2 of ‘.header’
.entrytext
Kernel Boot Section: 512 bytes (MBR)
Source : arch/x86/boot/header.S
<-arch/x86/boot/header.S
0x0
0x200
".header": Real-mode kernel header
Offset/Size   Name           Description
0x1F1/1       setup_sects    The size of the setup in sectors
0x1FE/2       boot_flag      Magic number: 0xAA55
0x200/2       jump           Jump instruction
0x214/4       code32_start   Boot loader hook: the address to jump to in protected mode (default: 0x100000)
0x1F1
Part 1 of ‘header’
Part 2 of ‘header’
.inittext
.initdata
.text
.text32
.rodata
.videocards
.data
.signature
.bss
<-arch/x86/boot/header.S
<-arch/x86/boot/header.S
<-arch/x86/boot/tty.c
<-arch/x86/boot/*.c
arch/x86/boot/bioscall.S
arch/x86/boot/copy.S
arch/x86/boot/pmjump.S
<-arch/x86/boot/pmjump.S
<-arch/x86/boot/*.c
<-arch/x86/boot/video-*.c
<-arch/x86/boot/*.c
<-4-byte signature
<-arch/x86/boot/*.c
bzImage
ELF sections
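
The header offsets listed in the table on this slide can be read directly out of a bzImage file. The following is a minimal Python sketch, assuming little-endian fields as on x86. The function names `read_setup_header` and `make_fake_image` are illustrative only, and the synthetic image (including its `setup_sects` value of 4) is made up for the demonstration rather than taken from a real kernel build.

```python
import struct

def read_setup_header(image: bytes) -> dict:
    """Read the four header fields listed on the slide from a bzImage."""
    (setup_sects,) = struct.unpack_from("<B", image, 0x1F1)    # 1 byte
    (boot_flag,) = struct.unpack_from("<H", image, 0x1FE)      # 2 bytes
    jump = bytes(image[0x200:0x202])                           # 2-byte jump instruction
    (code32_start,) = struct.unpack_from("<I", image, 0x214)   # 4 bytes
    if boot_flag != 0xAA55:
        raise ValueError("boot_flag is not 0xAA55")
    return {
        "setup_sects": setup_sects,
        "boot_flag": boot_flag,
        "jump": jump,
        "code32_start": code32_start,
    }

def make_fake_image() -> bytes:
    """Build a synthetic 1 KiB image (hypothetical values, for illustration)."""
    fake = bytearray(0x400)
    fake[0x1F1] = 4                                  # made-up setup_sects
    struct.pack_into("<H", fake, 0x1FE, 0xAA55)      # boot_flag magic
    fake[0x200:0x202] = bytes([0xEB, 0x6A])          # short jump, displacement 0x6a
    struct.pack_into("<I", fake, 0x214, 0x100000)    # default code32_start
    return bytes(fake)

hdr = read_setup_header(make_fake_image())
print(hex(hdr["boot_flag"]), hdr["jump"].hex(), hex(hdr["code32_start"]))
```

To try it on a real image, pass the file contents instead, for example `read_setup_header(open(path, "rb").read())`, where `path` points at a bzImage you have built.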

Layout of bzImage – setup.bin
setup.bin
Compressed
vmlinux
(Protected-mode kernel)
.bstext
CRC
.bsdata
Part 1 of ‘.header’
Part 2 of ‘.header’
.entrytext
Kernel Boot Section: 512 bytes (MBR)
Source : arch/x86/boot/header.S
<-arch/x86/boot/header.S
0x0
0x200
".header": Real-mode kernel header
Offset/Size   Name           Description
0x1F1/1       setup_sects    The size of the setup in sectors
0x1FE/2       boot_flag      Magic number: 0xAA55
0x200/2       jump           Jump instruction
0x214/4       code32_start   Boot loader hook: the address to jump to in protected mode (default: 0x100000)
0x1F1
Part 1 of ‘header’
Part 2 of ‘header’
short jump
.inittext
.initdata
.text
.text32
.rodata
.videocards
.data
.signature
.bss
<-arch/x86/boot/header.S
<-arch/x86/boot/header.S
<-arch/x86/boot/tty.c
<-arch/x86/boot/*.c
arch/x86/boot/bioscall.S
arch/x86/boot/copy.S
arch/x86/boot/pmjump.S
<-arch/x86/boot/pmjump.S
<-arch/x86/boot/*.c
<-arch/x86/boot/video-*.c
<-arch/x86/boot/*.c
<-4-byte signature
<-arch/x86/boot/*.c
bzImage
near call
long jump
1. CPU real mode (16 bits)
2. CPU real mode
3. CPU real mode -> CPU protected mode (32 bits)
ELF sections

Layout of bzImage – compressed vmlinux
setup.bin
(arch/x86/boot/setup.bin)
Compressed vmlinux
(Protected-mode kernel)
Note
ELF: arch/x86/boot/compressed/vmlinux
Binary: arch/x86/boot/vmlinux.bin
CRC
bzImage
vmlinux.bin
vmlinux.bin.gz
How to pack vmlinux.bin.gz?
arch/x86/boot/compressed
.head.text
.rodata..compressed
(vmlinux.bin.gz)
.text
.rodata
.data
arch/x86/boot/compressed/head_64.S
0x0
.bss
.pgtable
arch/x86/boot/compressed/piggy.S (created by arch/x86/boot/compressed/mkpiggy.c)
arch/x86/boot/compressed/vmlinux.bin.gz
arch/x86/boot/compressed/*.c
arch/x86/boot/compressed/head_64.S
arch/x86/boot/compressed/efi_thunk_64.S
arch/x86/boot/compressed/head_64.S
ELF Sections

Layout of bzImage – compressed vmlinux
Compressed vmlinux
setup.bin
(arch/x86/boot/setup.bin)
Compressed vmlinux
(Protected-mode kernel)
Note
ELF: arch/x86/boot/compressed/vmlinux
Binary: arch/x86/boot/vmlinux.bin
CRC
bzImage
.head.text
.rodata..compressed
(vmlinux.bin.gz)
.text
.rodata
.data
arch/x86/boot/compressed/head_64.S
0x0
.bss
.pgtable
arch/x86/boot/compressed/piggy.S (created by arch/x86/boot/compressed/mkpiggy.c)
arch/x86/boot/compressed/vmlinux.bin.gz
arch/x86/boot/compressed/*.c
arch/x86/boot/compressed/head_64.S
arch/x86/boot/compressed/efi_thunk_64.S
arch/x86/boot/compressed/head_64.S
ELF Sections

Layout of bzImage – compressed vmlinux.bin
* Symbol: Equivalent to using the ‘.set’ directive
* https://sourceware.org/binutils/docs/as/Setting-Symbols.html
Why z_input_len/input and z_output_len/output_len?
* BFD: Binary File Descriptor library - https://www.gnu.org/software/binutils/

Memory layout of bzImage – Entry Point Address
Where is ‘X’?
Physical Memory (highest address first; each region starts at the address shown)
Start address   Region                                       Notes
0x100000        Protected-mode kernel (compressed vmlinux)
0xA0000         I/O memory hole
X+0x10000       Command line, then Reserved for BIOS
X+0x08000       stack/heap                                   For use by the kernel real-mode/protected mode code
X               Kernel boot section + Kernel setup code      The kernel legacy boot sector; the kernel real-mode/protected mode code
0x01000         Boot loader                                  Boot sector entry point 0000:7C00
0x00800         Reserved for MBR/BIOS
0x00600         Typically used by MBR
0x00000         BIOS use only
Reference: Documentation/x86/boot.rst

Entry Point of Linux - GRUB
Memory addressing in real mode
[GRUB] Get the memory address for real mode code
1. gs = fs = es = ds = ss = 0x1000
2. sp = GRUB_LINUX_SETUP_STACK = 0x9000
3. cs = 0x1020, ip = 0
Registers configured by GRUB
Kernel boot section
0x10000
0x10200
Physical Memory
GRUB loads ‘setup.bin’ at address 0x10000
0
ds = es = fs = gs= ss
cs
stack
ss:sp= 0x1FFF0
protected mode
real mode
Kernel setup
code

Entry Point of Linux - GRUB
Memory addressing in real mode
[GRUB] Get the memory address for real mode code
1. gs = fs = es = ds = ss = 0x1000
2. sp = GRUB_LINUX_SETUP_STACK = 0x9000
3. cs = 0x1020, ip = 0
Registers configured by GRUB
Kernel boot section
0x10000
0x10200
Physical Memory
GRUB loads ‘setup.bin’ at address 0x10000
0
ds = es = fs = gs= ss
cs
stack
ss:sp= 0x1FFF0
protected mode
real mode
Kernel setup
code
1. Both the QEMU loader and GRUB load ‘setup.bin’ at address 0x10000
2. The QEMU loader sets SS:SP = 1000:FFF0, while GRUB sets SS:SP = 1000:9000
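
The register values on this slide translate to physical addresses through real-mode segmentation, where the address is the segment shifted left by 4 plus the offset. A minimal sketch of that arithmetic follows; the helper name `real_mode_linear` is an assumption for illustration.

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset

entry = real_mode_linear(0x1020, 0x0000)       # cs:ip set by the loaders
grub_stack = real_mode_linear(0x1000, 0x9000)  # SS:SP set by GRUB
qemu_stack = real_mode_linear(0x1000, 0xFFF0)  # SS:SP set by the QEMU loader

print(hex(entry), hex(grub_stack), hex(qemu_stack))
```

With cs = 0x1020 and ip = 0 the first instruction executed is at 0x10200, which is why the entry point sits 0x200 bytes past the 0x10000 load address of setup.bin.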

Entry Point of Linux: QEMU loader
Kernel boot section
0x10000
0x10200
Physical Memory
QEMU loader loads ‘setup.bin’ at address 0x10000
0
ds = es = fs = gs= ss
cs
stack
ss:sp= 0x1FFF0
protected mode
real mode
Kernel setup
code
1
2
3
4
5
6
7
ds = es = fs = gs = ss = segment_addr = 0x1000
esp = stack_addr = cmdline_addr - setup_addr - 16 = 0x20000 - 0x10000 - 16 = 0x10000 - 16 = 0xfff0
cs = 0x1020, ip = 0
Registers configured by QEMU loader
5
6
7
Prepare for far return
8
far return: ‘cs’ is changed by the CPU's own far-return mechanism

Entry Point of Linux: QEMU loader – Near and Far calls
3
4
5
6
7 Prepare for far return
8
far return: ‘cs’ is changed by the CPU's own far-return mechanism

Entry Point of Linux: QEMU loader
Kernel boot section
0x10000
0x10200
Physical Memory
QEMU loader loads ‘setup.bin’ at address 0x10000
0
ds = es = fs = gs= ss
cs
stack
sp= 0x1FFF0 (ss:0xFFF0)
protected mode
real mode
Kernel setup
code
Make sure setup.bin is loaded at 0x10000
Make sure vmlinux.bin is loaded at 0x100000
Address of setup.bin
Address of vmlinux.bin

arch/x86/boot/setup.ld
arch/x86/boot/header.S
1
2
Entry Point of Linux: GNU Linker
[GNU Linker] ENTRY() command
* The first executable instruction in an output file is the entry point
* ENTRY() is one of several ways of choosing the entry point. The linker tries each of the following in order and stops at the first that succeeds:
--the `-e' entry command-line option
--the ENTRY(symbol) command in a linker control script
--the value of the symbol start, if present
--the address of the first byte of the .text section, if present
--the address 0
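
The fallback order listed on this slide can be written as a small decision function. This is only a sketch of the selection logic, not how the GNU linker is implemented; the function name `choose_entry`, its parameters, and the symbol table passed in are hypothetical. The example assumes that setup.ld names `_start` in its ENTRY() command and that `_start` sits at offset 0x200, matching the jump at 0x200 in the header table.

```python
from typing import Dict, Optional

def choose_entry(symbols: Dict[str, int],
                 e_option: Optional[str] = None,
                 entry_cmd: Optional[str] = None,
                 text_start: Optional[int] = None) -> int:
    """Pick the entry address using the fallback order from the slide."""
    if e_option is not None:        # 1. the -e command-line option
        return symbols[e_option]
    if entry_cmd is not None:       # 2. ENTRY(symbol) in the linker script
        return symbols[entry_cmd]
    if "start" in symbols:          # 3. the value of the symbol start
        return symbols["start"]
    if text_start is not None:      # 4. first byte of the .text section
        return text_start
    return 0                        # 5. the address 0

# Assumed symbol table for setup.bin (illustrative values only).
setup_entry = choose_entry({"_start": 0x200}, entry_cmd="_start")
print(hex(setup_entry))
```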

arch/x86/boot/setup.ld
1
Entry Point of Linux: GNU Linker
[GNU Linker] ENTRY() command
* The first executable instruction in an output file is the entry point
* ENTRY() is one of several ways of choosing the entry point. The linker tries each of the following in order and stops at the first that succeeds:
--the `-e' entry command-line option
--the ENTRY(symbol) command in a linker control script
--the value of the symbol start, if present
--the address of the first byte of the .text section, if present
--the address 0
Kernel boot section
0x10000
0x10200
Physical Memory
QEMU loader loads ‘setup.bin’ at address 0x10000
0
ds = es = fs = gs= ss
cs
stack
sp= 0x1FFF0 (ss:0xFFF0)
protected mode
real mode
Kernel setup
code

Entry Point of Linux: start_of_setup - GDB
Kernel boot section
0x10000
0x10200
Physical Memory
Boot loader loads ‘setup.bin’ at address 0x10000
0
gs= fs = es = ds = ss
cs
stack
sp= 0x1FFF0 (ss:0xFFF0)
protected mode
real mode
Kernel setup
code

Entry Point of Linux: start_of_setup - GDB
Kernel boot section
0x10000
0x10200
Physical Memory
Boot loader loads ‘setup.bin’ at address 0x10000
0
gs= fs = es = ds = ss
cs
stack
sp= 0x1FFF0 (ss:0xFFF0)
protected mode
real mode
Kernel setup
code

Entry Point of Linux: start_of_setup – short jump
Kernel boot section
0x10000
0x10200
Physical Memory
Boot loader loads ‘setup.bin’ at address 0x10000
0
gs= fs = es = ds = ss
cs
stack
sp= 0x1FFF0 (ss:0xFFF0)
protected mode
real mode
Kernel setup
code
".header": Real-mode kernel header
Offset/Size   Name           Description
0x1F1/1       setup_sects    The size of the setup in sectors
0x1FE/2       boot_flag      Magic number: 0xAA55
0x200/2       jump           Jump instruction
0x214/4       code32_start   Boot loader hook: the address to jump to in protected mode (default: 0x100000)

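Entry Point of Linux: start_of_setup – short jump
0x26c - 0x202 = 0x6a (short-jump displacement: the target offset minus the offset of the instruction that follows the 2-byte jump at 0x200)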
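
The subtraction on this slide is the usual rule for an x86 short jump: the 8-bit displacement is measured from the end of the 2-byte jump instruction. A minimal sketch, assuming the jump sits at offset 0x200 (as in the header table) and the target at 0x26c; the function names are illustrative.

```python
def short_jump_rel8(jump_addr: int, target: int) -> int:
    """Return the 8-bit displacement for a 2-byte short jump (opcode 0xEB)."""
    next_ip = jump_addr + 2              # address of the following instruction
    disp = target - next_ip
    if not -128 <= disp <= 127:
        raise ValueError("target is out of range for a short jump")
    return disp & 0xFF                   # two's-complement byte

def short_jump_bytes(jump_addr: int, target: int) -> bytes:
    """Encode the full instruction: opcode 0xEB followed by rel8."""
    return bytes([0xEB, short_jump_rel8(jump_addr, target)])

rel8 = short_jump_rel8(0x200, 0x26C)
print(hex(rel8), short_jump_bytes(0x200, 0x26C).hex())
```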

Entry Point of Linux: start_of_setup
Call Path
Kernel boot section
0x10000
0x10200
Boot loader loads ‘setup.bin’ at address 0x10000
0
gs= fs = es = ds = ss = cs
cs
stack
sp= 0x1FFF0 (ss:0xFFF0)
protected mode
real mode
Kernel setup
code
Physical Memory
lretw instruction: Far Return Operation
‘l’ prefix: far control transfer
‘w’ suffix: word (16 bits)

Entry Point of Linux: start_of_setup
Call Path
Kernel boot section
0x10000
0x10200
Boot loader loads ‘setup.bin’ at address 0x10000
0
gs= fs = es = ds = ss = cs
cs
stack
sp= 0x1FFF0 (ss:0xFFF0)
protected mode
real mode
Kernel setup
code
Physical Memory
lretw instruction: Far Return Operation
‘l’ prefix: far control transfer
‘w’ suffix: word (16 bits)
1
1
2
2
3
3

Entry Point of Linux: start_of_setup
Kernel boot section
0x10000
0x10200
Boot loader loads ‘setup.bin’ at address 0x10000
0
gs= fs = es = ds = ss = cs
cs
stack
sp= 0x1FFF0 (ss:0xFFF0)
protected mode
real mode
Kernel setup
code
Physical Memory
Call Path
lretw instruction: Far Return Operation
‘l’ prefix: far control transfer
‘w’ suffix: word (16 bits)

Entry Point of Linux: start_of_setup – Why align CS?
Kernel boot section
0x10000
0x10200
Boot loader loads ‘setup.bin’ at address 0x10000
0
gs= fs = es = ds = ss = cs
cs
stack
sp= 0x1FFF0 (ss:0xFFF0)
protected mode
real mode
Kernel setup
code
Physical Memory
Call Path
lretw instruction: Far Return Operation
‘l’ prefix: far control transfer
‘w’ suffix: word (16 bits)
If cs is not aligned with ds, ds and es are incorrect after returning from ‘intcall’.

Entry Point of Linux: start_of_setup – data & bss section
Call Path
Kernel boot section
0x10000
0x10200
Boot loader loads ‘setup.bin’ at address 0x10000
0
gs= fs = es = ds
= ss= cs
stack
sp= 0x1FFF0
protected mode
real mode
Kernel setup
code
Kernel boot section
0x10000
0x10200
0
gs= fs = es = ds
= ss = cs
stack
sp= 0x1FFF0
protected mode
real mode
Kernel setup
code
BSS Section
_end
__bss_start
__bss_end
Data Section
Physical Memory

Entry Point of Linux: start_of_setup -> main()
Call Path

Entry Point of Linux: start_of_setup -> main()

Entry Point of Linux: start_of_setup -> main()

Entry Point of Linux: start_of_setup -> main() -> copy_boot_params()
Call Path
•Copy the setup header into the boot parameter block (struct boot_params: arch/x86/include/uapi/asm/bootparam.h)
o`struct setup_header hdr` in boot_params
▪Contains the same fields defined in the Linux boot protocol. Those fields are configured by the boot loader and at kernel compile/build time

Entry Point of Linux: start_of_setup -> main() -> console_init() – (1/2)
Call Path
•console_init()
oInitialize the corresponding serial port if the command line has the ‘earlyprintk’ parameter
Kernel boot section
0x10000
0x10200
0
gs= fs = es = ds
= ss = cs
stack
sp= 0x1FFF0
protected mode
real mode
Kernel setup
code
BSS Section
_end
__bss_start
__bss_end
Data Section
Kernel Command Line
0x20000
QEMU Loader
Physical Memory

Entry Point of Linux: start_of_setup -> main() -> console_init() – (2/2)
Call Path
•console_init()
oInitialize the corresponding serial port if the command line has the ‘earlyprintk’ parameter
Kernel boot section
0x10000
0x10200
0
gs= fs = es = ds
= ss = cs
stack
sp= 0x1FFF0
protected mode
real mode
Kernel setup
code
BSS Section
_end
__bss_start
__bss_end
Data Section
Kernel Command Line
0x20000
Physical Memory

Entry Point of Linux: start_of_setup -> main() -> validate_cpu() & detect_memory()
Call Path
•init_heap()
oDiscussed in the next few slides
•validate_cpu()
oCheck CPU flags
oCheck if long mode (x86_64) is available
o[AMD K7 processor] Turn SSE+SSE2 on if they are missing in the CPU flags
•detect_memory()
oUse different programming interfaces (0xe820, 0xe801 and 0x88) for memory detection
o0xe820
▪Fill boot_params.e820_table based on the e820 map
Kernel boot section
0x10000
0x10200
0
gs= fs = es = ds
= ss = cs
stack
sp= 0x1FFF0
protected mode
real mode
Kernel setup
code
BSS Section
_end
__bss_start
__bss_end
Data Section
Kernel Command Line
0x20000
Physical Memory
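
To make the 0xe820 step on this slide more concrete, the sketch below decodes a buffer of e820 entries. It assumes the packed 20-byte layout of an entry (64-bit address, 64-bit size, 32-bit type) and the type values 1 for usable RAM and 2 for reserved. The sample map is invented for illustration and does not come from the slides or from a real machine.

```python
import struct

# Assumed packed layout of one e820 entry: addr (u64), size (u64), type (u32).
E820_ENTRY = struct.Struct("<QQI")
E820_TYPE_RAM = 1
E820_TYPE_RESERVED = 2

def parse_e820(raw: bytes, count: int):
    """Decode `count` consecutive e820 entries into (addr, size, type) tuples."""
    return [E820_ENTRY.unpack_from(raw, i * E820_ENTRY.size) for i in range(count)]

# Hypothetical memory map, for illustration only.
sample = b"".join(E820_ENTRY.pack(addr, size, kind) for addr, size, kind in [
    (0x00000000, 0x0009FC00, E820_TYPE_RAM),
    (0x0009FC00, 0x00000400, E820_TYPE_RESERVED),
    (0x00100000, 0x07EE0000, E820_TYPE_RAM),
])

entries = parse_e820(sample, 3)
usable = sum(size for _, size, kind in entries if kind == E820_TYPE_RAM)
for addr, size, kind in entries:
    print(f"{addr:#012x} - {addr + size - 1:#012x} type {kind}")
print("usable bytes:", hex(usable))
```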

Entry Point of Linux: start_of_setup -> main() -> init_heap() (1/2)
Call Path
•init_heap()
oSet up the heap space if the ‘CAN_USE_HEAP’ flag (0x80) is set in loadflags of the kernel setup header.
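
The init_heap() decision can be sketched as follows. The 0x80 flag comes from this slide and STACK_SIZE = 0x400 from the next one; the `+ 0x200` adjustment to `heap_end_ptr` and the clamp against the stack are my reading of the kernel's init_heap() and should be treated as assumptions, as should the sample register values.

```python
from typing import Optional

CAN_USE_HEAP = 0x80   # loadflags bit named on the slide
STACK_SIZE = 0x400    # stack size shown on the next slide

def init_heap(loadflags: int, heap_end_ptr: int, esp: int) -> Optional[int]:
    """Return heap_end, or None when the boot loader did not offer a heap."""
    if not loadflags & CAN_USE_HEAP:
        return None                      # no heap: heap_end stays at _end
    stack_end = esp - STACK_SIZE         # lowest address the stack may reach
    heap_end = heap_end_ptr + 0x200      # assumed adjustment, see lead-in
    return min(heap_end, stack_end)      # the heap must not run into the stack

# Illustrative values: sp = 0xFFF0 as set by the QEMU loader.
print(init_heap(CAN_USE_HEAP, 0xFE00, 0xFFF0))
print(init_heap(0x00, 0xFE00, 0xFFF0))
```

The clamp is what keeps the heap and the stack, which grow toward each other inside the same 64 KB segment, from overlapping.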

Entry Point of Linux: start_of_setup -> main() -> init_heap() (2/2)
Call Path
[Figure: two memory layouts of the setup area loaded at 0x10000. Both show the kernel boot section, the kernel setup code from 0x10200, the data section, the BSS section between __bss_start and __bss_end, and the stack below sp (STACK_SIZE = 0x400), with gs = fs = es = ds = ss = cs.]
Panel 1, no heap: HEAP = heap_end = _end; the space between _end and the stack is an unused area.
Panel 2, heap allocated because the ‘CAN_USE_HEAP’ flag (0x80) is set: HEAP = _end; the heap extends from _end up to heap_end.

go_to_protected_mode
Call Path
setup_gdt(): Set up a 4 GB memory space for CS/DS
Descriptor Table: boot_gdt (located through GDTR)
Index   Entry
0       NULL
1       NULL
2       GDT_ENTRY_BOOT_CS
3       GDT_ENTRY_BOOT_DS
4       GDT_ENTRY_BOOT_TSS
x86 Segmentation: Address Translation
Base address = 0, limit = 0xFFFFFFFF: the CS and DS segments cover system memory from 0 to 0xFFFFFFFF
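
The flat 4 GB segments described on this slide correspond to two 8-byte descriptors. The sketch below packs a descriptor from flags, base and limit in the same way as the kernel's GDT_ENTRY() macro, as far as I recall it; the flag values 0xc09b (code) and 0xc093 (data) and the 20-bit limit 0xfffff are assumptions drawn from setup_gdt() rather than from the slides.

```python
def gdt_entry(flags: int, base: int, limit: int) -> int:
    """Pack a segment descriptor into its 64-bit GDT representation."""
    return (((base & 0xFF000000) << (56 - 24)) |   # base bits 31..24
            ((flags & 0x0000F0FF) << 40) |         # access byte and flag nibble
            ((limit & 0x000F0000) << (48 - 16)) |  # limit bits 19..16
            ((base & 0x00FFFFFF) << 16) |          # base bits 23..0
            (limit & 0x0000FFFF))                  # limit bits 15..0

def byte_limit(limit_20bit: int) -> int:
    """Last addressable byte when the granularity bit selects 4 KB units."""
    return ((limit_20bit + 1) << 12) - 1

boot_cs = gdt_entry(0xC09B, 0, 0xFFFFF)   # code: read/execute, base 0
boot_ds = gdt_entry(0xC093, 0, 0xFFFFF)   # data: read/write, base 0
print(hex(boot_cs), hex(boot_ds), hex(byte_limit(0xFFFFF)))
```

A 20-bit limit of 0xfffff counted in 4 KB units gives the 0xFFFFFFFF upper bound shown on the slide.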

protected_mode_jump (1/6)

protected_mode_jump – ljmpl instruction: ignore ‘.Lin_pm32’ relocation (2/6)
0x30cc
Jump (absolute address) to the wrong location
sp(STACK_SIZE = 0x400)
Kernel boot section
0x10000
0x10200
stack
protected mode
real mode
Kernel setup
code
BSS Section
Heap
__bss_start
__bss_end
HEAP = _end
heap_end
Data Section
gs = fs = es = ds = ss = cs
setup.bin generation
Physical Memory

protected_mode_jump – ljmpl instruction - relocation (3/6)
sp(STACK_SIZE = 0x400)
Kernel boot section
0x10000
0x10200
stack
protected mode
real mode
Kernel setup
code
BSS Section
Heap
__bss_start
__bss_endHEAP = _end
heap_end
Data Section
gs= fs = es = ds = ss = cs
Relocation for absolute address of ‘ljmpl’
ljmpl
Physical Memory

protected_mode_jump – ljmpl instruction (4/6)
Relocation for absolute address of ‘ljmpl’
sp(STACK_SIZE = 0x400)
Kernel boot section
0x10000
0x10200
stack
protected mode
real mode
Kernel setup
code
BSS Section
Heap
__bss_start
__bss_endHEAP = _end
heap_end
Data Section
gs= fs = es = ds = ss = cs
ljmpl
Physical Memory

protected_mode_jump – ljmpl instruction: instruction format (5/6)

protected_mode_jump – ljmpl instruction: instruction format (6/6)
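
The relocation and instruction-format slides can be tied together with a short sketch. It assumes the far jump is emitted as the bytes 0x66 0xEA followed by a 32-bit offset and a 16-bit selector, that the selector is the GDT index of GDT_ENTRY_BOOT_CS (2) times 8, and that the link-time offset of ‘.Lin_pm32’ is the 0x30cc shown on slide 2/6. The function names are illustrative.

```python
import struct

BOOT_CS = 2 * 8   # selector for GDT index 2 (GDT_ENTRY_BOOT_CS)

def relocate(link_offset: int, cs: int) -> int:
    """Turn an offset inside setup.bin into an absolute address (cs << 4 + offset)."""
    return link_offset + (cs << 4)

def encode_ljmpl(offset: int, selector: int = BOOT_CS) -> bytes:
    """Encode ljmpl: operand-size prefix, far-jump opcode, offset32, selector16."""
    return bytes([0x66, 0xEA]) + struct.pack("<IH", offset, selector)

target = relocate(0x30CC, 0x1000)      # cs = 0x1000 after start_of_setup aligns it
insn = encode_ljmpl(target)
print(hex(target), insn.hex())
```

Without the relocation step the jump would use the bare 0x30cc as an absolute address, which is the "wrong location" case called out on slide 2/6.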

Protected mode: ‘.Lin_pm32’ (1/2)
[real mode] SP configuration
[protected mode] SP configuration
`addl %ebx, %esp` in label “.Lin_pm32”
0x1FF80 (SS:SP = 0x1000:0xFF80)
Kernel boot section
0x10000
0x10200
stack
protected mode
real mode
Kernel setup
code
BSS Section
Heap
__bss_start
__bss_endHEAP = _end
heap_end
esp= 0x1FF80
Kernel boot section
0x10000 (ebx)
0x10200
stack
protected mode
real mode
Kernel setup
code
BSS Section
Heap
__bss_start
__bss_endHEAP = _end
heap_end
Data Section Data Section

Data Section
1
2
4
3
Protected mode: ‘.Lin_pm32’ (2/2)
X = 0x10000
esp= 0x1FF80
Kernel boot section
0x10200
stack
protected mode
real mode
Kernel setup
code
BSS Section
Heap
__bss_start
__bss_end
HEAP = _end
heap_end
Reserved for BIOS command line
I/O memory hole
Protected-mode kernel code
(compressed vmlinux)
X+0x10000
0xA0000
0x100000
5 jmpl *%eax
Physical Memory
Call Path

Compressed vmlinux: memory layout (1/10)
.head.text – startup_32
0x100000 (ebp register)
0x100200
decompressed vmlinux.bin.bz
.head.text – startup_64
0x1000000
compressed vmlinux
(Relocation)
0x1000000 + boot_param.init_size
0x1000000 + boot_param.init_size - _end (rbx register)
vmlinux.bin.gz
.text
.rodata
.data
.bss
.pgtable
_end
0x100000 + _end
boot_heap (size: 0x10000)
boot_stack (size: 0x4000)

input_data
input_data_end
Memory Layout
32-bit entry point
_bss

Compressed vmlinux: boot_stack & boot_heap in .bss (2/10)
.head.text – startup_32
0x100000 (ebp register)
0x100200
decompressed vmlinux.bin.bz
.head.text – startup_64
0x1000000
compressed vmlinux
(Relocation)
0x1000000 + boot_param.init_size
0x1000000 + boot_param.init_size - _end (rbx register)
vmlinux.bin.gz
.text
.rodata
.data
.bss
.pgtable
_end
0x100000 + _end
boot_heap (size: 0x10000)
boot_stack (size: 0x4000)

input_data
input_data_end
Memory Layout
32-bit entry point
_bss

Compressed vmlinux: High-level Overview (3/10)
Why relocation?
•Base address of the 32-bit Linux kernel entry point: 0x100000
•Default base address of the Linux kernel: CONFIG_PHYSICAL_START=0x1000000
•Use case
•kdump: a rescue kernel is loaded at a different address
•PIE (Position Independent Executable) and PIC (Position Independent Code)

Compressed vmlinux: startup_32: 32-bit entry point (4/10)
1
1

Compressed vmlinux: startup_32 (5/10)
1
1
Get the loading address

2
2
Compressed vmlinux: startup_32 (6/10)

Compressed vmlinux: startup_32: Init 4-level page table (7/10)
[Paging] Identity mapping for 0-4GB memory space
Linear address fields (4-level paging with 2 MB pages):
Bits 63-48   Sign-extend
Bits 47-39   Page Map Level-4 Offset (9 bits)
Bits 38-30   Page Directory Pointer Offset (9 bits)
Bits 29-21   Page Directory Offset (9 bits)
Bits 20-0    Physical Page Offset
Translation: CR3 -> Page Map Level-4 Table (PML4E #0) -> Page Directory Pointer Table (PDPTE #0 to PDPTE #3) -> Page Directory Tables -> 2 MB physical page
PDPTE #0: PDE #0 to PDE #511
PDPTE #1: PDE #512 to PDE #1023
PDPTE #2: PDE #1024 to PDE #1535
PDPTE #3: PDE #1536 to PDE #2047
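
The identity mapping on this slide can be checked with a few lines of arithmetic. The sketch below splits a linear address into the table indexes used with 2 MB pages and computes the physical base that each of the 2048 page-directory entries would map under identity mapping; the function names are illustrative.

```python
PAGE_2M = 1 << 21

def split_linear_2m(addr: int) -> dict:
    """Split a linear address into 4-level paging indexes for 2 MB pages."""
    return {
        "pml4": (addr >> 39) & 0x1FF,   # bits 47-39
        "pdpt": (addr >> 30) & 0x1FF,   # bits 38-30
        "pd": (addr >> 21) & 0x1FF,     # bits 29-21
        "offset": addr & (PAGE_2M - 1), # bits 20-0
    }

def identity_pde_base(n: int) -> int:
    """Physical base mapped by PDE #n (0..2047) when linear == physical."""
    return n * PAGE_2M

top = split_linear_2m(0xFFFFFFFF)         # last byte below 4 GB
pde_number = top["pdpt"] * 512 + top["pd"]
print(top, pde_number, hex(identity_pde_base(pde_number)))
print(hex(2048 * PAGE_2M))                # total size covered by PDE #0..#2047
```

Four PDPT entries, each pointing at a page directory of 512 entries of 2 MB, give exactly the 0-4GB range named on the slide.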

Compressed vmlinux: startup_32: Init 4-level page table (8/10)
Reference: Section 4.1 “PAGING MODES AND CONTROL BITS”, Intel® 64 and IA-32 Architectures Software Developer’s Manual
Volume 3 (3A, 3B, 3C & 3D): System Programming Guide

Compressed vmlinux: startup_32: Init 4-level page table (9/10)

Compressed vmlinux: far return to startup_64 (10/10)
rva(startup_64) = 0x200
ebp = 0x100000
eax = 0x100000 + 0x200 = 0x100200

Compressed vmlinux: startup_64
2
3
Why reload CS? (Commit “34bb49229f19”)
When the pre-decompression code loads its first GDT in startup_64, it is still
running on the CS value of the previous GDT. In the case of SEV-ES this is the EFI
GDT. It can be anything depending on what has loaded the kernel (EFI, legacy boot
code, container runtime, etc.)

Compressed vmlinux: [.text] .Lrelocated (1/5)
4
5
Why call initialize_identity_maps()?

Compressed vmlinux: [.text] .Lrelocated (2/5)
4
5
Why map boot_params and the command line?

Compressed vmlinux: parse_elf (3/5)
4
ELF Header
0x1000000
decompressed vmlinux.bin.bz
(vmlinux.bin–ELF format)
program headers
program header #0
(.text, .rodata, .pci_fixup….)
0x1200000
program header #1
(.data .vvar)
program header #2
(.init.text.altinstr_aux…)
0x1a00000
0x1ac2000
program header #3 (.notes)
0x18886b0
0x1000000
program header #0
(.text, .rodata, .pci_fixup….)
0x1800000
program header #1
(.data .vvar)
program header #2
(.init.text.altinstr_aux…)
0x18c2000
Physical memory Physical memory

Compressed vmlinux: handle_relocations (4/5)
4
CONFIG_RELOCATABLE
•Retain relocation information (generate .rel.* or .rela.* sections) when building a kernel image, so it can be loaded somewhere other than the default address (CONFIG_PHYSICAL_START = 16MB).
•Use case: kdump kernel (recovery kernel)
handle_relocations() - Relocation if CONFIG_X86_NEED_RELOCS is set
•Depends on RANDOMIZE_BASE || (X86_32 && RELOCATABLE)
•Scan relocation tables (.rel.* or .rela.* sections) for symbol relocation

Compressed vmlinux: handle_relocations (5/5)
4
vmlinux.bin.bz
vmlinux.bin
vmlinux.relocs
handle_relocations(): Perform relocation backwards from the end of the decompressed vmlinux
64-bit relocation address
0
32-bit relocation address
0
objcopy options
-R section_name: Remove any section matching section_name
-S or --strip-all: Do not copy relocation and symbol information from the source file
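
The backward walk that handle_relocations() performs can be illustrated with a toy image. This is a simplified sketch, not the kernel's implementation: it assumes a table appended to the image in the order shown on the slide (a 0 terminator, the 64-bit relocation entries, another 0 terminator, then the 32-bit relocation entries at the very end), and it stores plain byte offsets into the image instead of the virtual addresses the kernel uses. The function name `apply_relocs` and all sample values are hypothetical.

```python
import struct

def apply_relocs(image: bytearray, delta: int) -> None:
    """Apply relocations by walking the appended table from the end backwards."""
    pos = len(image) - 4

    def entry(at: int) -> int:
        return struct.unpack_from("<I", image, at)[0]

    # 32-bit relocations: stop at the first 0 terminator.
    while entry(pos) != 0:
        off = entry(pos)
        (val,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (val + delta) & 0xFFFFFFFF)
        pos -= 4
    pos -= 4  # step over the terminator

    # 64-bit relocations: stop at the next 0 terminator.
    while entry(pos) != 0:
        off = entry(pos)
        (val,) = struct.unpack_from("<Q", image, off)
        struct.pack_into("<Q", image, off, (val + delta) & 0xFFFFFFFFFFFFFFFF)
        pos -= 4

# Toy image: one 32-bit value at offset 4 and one 64-bit value at offset 8.
body = bytearray(16)
struct.pack_into("<I", body, 4, 0x01000010)
struct.pack_into("<Q", body, 8, 0xFFFFFFFF81000000)
table = struct.pack("<4I", 0, 8, 0, 4)   # [0][64-bit: 8][0][32-bit: 4]
image = body + table

apply_relocs(image, 0x200000)            # pretend the kernel moved by 2 MB
print(hex(struct.unpack_from("<I", image, 4)[0]),
      hex(struct.unpack_from("<Q", image, 8)[0]))
```

Placing the table at the end and terminating each part with 0 lets the loader find and consume the entries without any separate header describing their count.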

Recap
setup.bin
(arch/x86/boot/setup.bin)
Compressed vmlinux
(Protected-mode kernel)
Note
ELF: arch/x86/boot/compressed/vmlinux
Binary: arch/x86/boot/vmlinux.bin
CRC
bzImage

[More info] bzImage = vmlinuz
On a physical machine
Source code: arch/x86/boot/Makefile, arch/x86/boot/install.sh

Reference
•The Linux/x86 Boot Protocol, Documentation/x86/boot.rst
•Intel® 64 and IA-32 Architectures Software Developer’s Manual
•https://wdv4758h.github.io/notes/blog/linux-kernel-boot.html
•Linux insides, https://0xax.gitbooks.io/linux-insides/content/

Appendix

gdb: Preparation for debugging the real-mode code of the Linux kernel (1/2)
GitHub: https://github.com/AdrianHuang/gdb-linux-real-mode

gdb: Preparation for debugging the real-mode code of the Linux kernel (2/2)
GitHub: https://github.com/AdrianHuang/gdb-linux-real-mode

initialize_identity_maps
x86_mapping_info
  void *(*alloc_pgt_page)(void *)
  void *context
  unsigned long page_flag
  unsigned long offset
  bool direct_gbpages
  unsigned long kernpg_flag
alloc_pgt_data
  unsigned char *pgt_buf
  unsigned long pgt_buf_size
  unsigned long pgt_buf_offset

UEFI booting flow – EFI boot stub: Entry point
AddressOfEntryPoint (efi_pe_entry): 0x18d84a
ImageBase = 0x1000000
Physical address of AddressOfEntryPoint = 0x1000000 + 0x18d84a = 0x118d84a

UEFI booting flow – EFI Handover protocol

UEFI booting flow – EFI Handover protocol

UEFI booting flow – EFI Handover protocol
At which address does the boot loader load bzImage?

UEFI booting: call path