Trap Handling in Linux

Trap Handling in Linux focusing on system call Yongrae Jo 2017. 2. 16


Trap Handling in Linux: focusing on system call

Yongrae Jo

2017. 2. 16

2

CONTENTS

Background

Function Call Flow from start_kernel()

IDT initialization & Its Data Structure(gate, idt_table, MSR)

syscall entry, fast vs slow path, sys_call_table Initialization

system call procedure from user application and glibc

3

Background

Interrupt

Hardware Interrupt: External Interrupt, Asynchronous Interrupt, IRQ

Software Interrupt: Internal Interrupt, Synchronous Interrupt, Trap, Exception, Fault, System Call

But in Linux, all software interrupts are regarded as traps

An interrupt is a signal from a device attached to a computer or from a program within the computer that requires the operating system to stop and figure out what to do next. (Source : whatis.techtarget.com)

4

Execution Flow of Interrupt Service

Normal Execution

Interrupt Triggered

Non-Maskable Interrupt(NMI)

1. Save current state
2. Call handler routine

But it is masked

Execute Requested Handler Routine

1. Restore state
2. Return from ISR

Source : Image from http://studymake.tistory.com/341
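The flow on this slide (trigger, masked or not, run handler, resume) can be observed from user space with POSIX signals. This is only an analogy, not the kernel's interrupt path: the kernel saves and restores the interrupted context around the handler, and a blocked signal plays the role of a masked interrupt. The function names below are made up for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Counts handler runs; sig_atomic_t is safe to modify inside a handler. */
static volatile sig_atomic_t handled_count = 0;

/* Plays the role of the "requested handler routine". */
static void on_sigusr1(int signo)
{
    (void)signo;
    handled_count++;
}

/* Block (mask) or unblock SIGUSR1 for the caller; returns 0 on success. */
static int set_masked(int masked)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    return sigprocmask(masked ? SIG_BLOCK : SIG_UNBLOCK, &set, NULL);
}

/*
 * Raise SIGUSR1 while it is masked, then unmask it.
 * *while_masked receives the handler count seen while masked,
 * *after_unmask the count seen right after unmasking.
 */
static void run_masked_demo(int *while_masked, int *after_unmask)
{
    struct sigaction sa;
    int rc;

    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    handled_count = 0;
    rc = set_masked(1);
    assert(rc == 0);
    raise(SIGUSR1);                 /* triggered, but masked: stays pending */
    *while_masked = handled_count;

    rc = set_masked(0);             /* unmask: the pending signal is delivered */
    assert(rc == 0);
    *after_unmask = handled_count;  /* normal execution has resumed here */
    (void)rc;
}
```

Calling run_masked_demo() from a test program shows that the handler does not run while the signal is masked and runs once when the mask is lifted.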

5

(External)Interrupt Controller

source : http://embien.com/blog/interrupt-handling-in-embedded-software/

6

CONTENTS

Background

Function Call Flow from start_kernel()

IDT initialization & Its Data Structure(gate, idt_table, MSR)

syscall entry, fast vs slow path, sys_call_table Initialization

system call procedure from user application and glibc

7

Function Call Flow from start_kernel()

8

Function Call Flow from start_kernel()

9

CONTENTS

Background

Function Call Flow from start_kernel()

IDT initialization & Its Data Structure(gate, idt_table, MSR)

syscall entry, fast vs slow path, sys_call_table Initialization

system call procedure from user application and glibc

10

trap_init() from: /usr/src/linux-4.9.6/arch/x86/kernel/traps.c

11

trap_init() from: /usr/src/linux-4.9.6/arch/x86/kernel/traps.c

12

List of interrupts from: /usr/src/linux-4.9.6/arch/x86/include/asm/traps.h

13

x86’s Interrupt Descriptor Table

Source : Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1

14

x86’s Interrupt Descriptor Table(cont’d)

Source : Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1

15

x86’s Interrupt Descriptor Table(cont’d)

Where are the other interrupts defined?

16

List of interrupts from: /usr/src/linux-4.9.6/arch/x86/include/asm/irq_vectors.h

In this file, we can see the other interrupt vector names and their numbers

17

Let’s follow set_intr_gate()’s function call flow

trap_init()

18

Let’s follow set_intr_gate()’s function call flow

This function is the real deal!

trap_init()

19

What is a gate?

…The architecture also defines a set of special descriptors called gates (call gates, interrupt gates, trap gates, and task gates). These provide protected gateways to system procedures and handlers that may operate at a different privilege level than application programs and most procedures. For example, a CALL to a call gate can provide access to a procedure in a code segment that is at the same or a numerically lower privilege level (more privileged) than the current code segment.

Source : Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1

20

_set_gate() from : /usr/src/linux-4.9.6/arch/x86/include/asm/desc.h

We need to know the meaning of the following terms: “gate_desc”, “type”, “dpl”, “ist”, “seg” and “idt_table”

21

gate_desc from: /usr/src/linux-4.9.6/arch/x86/include/asm/desc_defs.h

Bit fields

gate_desc in 64 bits
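To make the bit fields concrete, here is a user-space sketch of a 16-byte 64-bit gate descriptor, modeled on the kernel's gate_struct64 and its packing helper. The names gate64, pack_gate64 and gate64_handler are made up for this sketch, the layout relies on GCC/Clang bit-field and packed-attribute behaviour, and the example handler address is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

#define GATE_INTERRUPT 0xE  /* interrupt gate: interrupts disabled on entry */
#define GATE_TRAP      0xF  /* trap gate: interrupt flag left unchanged */

/* 64-bit mode IDT gate: the handler address is split into three pieces. */
struct gate64 {
    uint16_t offset_low;      /* handler address bits 0..15 */
    uint16_t segment;         /* code segment selector */
    uint16_t ist   : 3;       /* Interrupt Stack Table index (0 = none) */
    uint16_t zero0 : 5;
    uint16_t type  : 5;       /* 4-bit gate type plus the always-zero bit above it */
    uint16_t dpl   : 2;       /* privilege level allowed to invoke the gate */
    uint16_t p     : 1;       /* present bit */
    uint16_t offset_middle;   /* handler address bits 16..31 */
    uint32_t offset_high;     /* handler address bits 32..63 */
    uint32_t zero1;
} __attribute__((packed));

_Static_assert(sizeof(struct gate64) == 16, "a 64-bit gate is 16 bytes");

/* Fill a gate from its logical fields. */
static struct gate64 pack_gate64(unsigned type, uint64_t func,
                                 unsigned dpl, unsigned ist, uint16_t seg)
{
    struct gate64 g;

    assert(type <= 0x1f && dpl <= 3 && ist <= 7);
    g.offset_low    = (uint16_t)(func & 0xffff);
    g.segment       = seg;
    g.ist           = ist;
    g.zero0         = 0;
    g.type          = type;
    g.dpl           = dpl;
    g.p             = 1;
    g.offset_middle = (uint16_t)((func >> 16) & 0xffff);
    g.offset_high   = (uint32_t)(func >> 32);
    g.zero1         = 0;
    return g;
}

/* Reassemble the handler address the CPU would jump to. */
static uint64_t gate64_handler(const struct gate64 *g)
{
    return (uint64_t)g->offset_low |
           ((uint64_t)g->offset_middle << 16) |
           ((uint64_t)g->offset_high << 32);
}
```

Packing an address with pack_gate64(GATE_INTERRUPT, addr, 0, 0, 0x10) and reading it back with gate64_handler() should return the same address, which is a quick way to check the split into low, middle and high parts.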

22

gate_struct64 and its connection to x86’s features

Source : Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1

23

type field in gate_struct64

Source : Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1

24

write_idt_entry() from: /usr/src/linux-4.9.6/arch/x86/include/asm/desc.h

25

Let’s see cpu_init()

26

cpu_init(): /usr/src/linux-4.9.6/arch/x86/kernel/cpu/common.c

27

load_current_idt() function call flow

Inline assembly: the load IDT (lidt) instruction

28

GCC’s Inline Assembly for x86 in Linux

Load idt nth parameter

Input operands: memory constraints

C expression memory address

Source : https://www.ibm.com/developerworks/library/l-ia/
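The lidt instruction is privileged, so it cannot be tried from user space. The sketch below uses a harmless movl instead to show the same pieces of GCC inline assembly: the numbered operand placeholders, an input operand with the "m" (memory) constraint, and the C expression that names the memory location. load_from_memory is a made-up name, and the inline assembly is only compiled on x86 with GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit value from memory with a single inline-assembly instruction. */
static uint32_t load_from_memory(const uint32_t *src)
{
    assert(src != NULL);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    uint32_t out;
    /*
     * "movl %1, %0" : AT&T syntax, source first, destination second.
     * %0 = "=r"(out): output operand, any general-purpose register.
     * %1 = "m"(*src): input operand with a memory constraint; the C
     *                 expression *src tells the compiler which memory
     *                 location the instruction reads.
     * The kernel's IDT load has the same shape, with no outputs:
     *     asm volatile("lidt %0" : : "m"(*dtr));
     */
    __asm__ volatile("movl %1, %0" : "=r"(out) : "m"(*src));
    return out;
#else
    return *src;  /* plain C fallback on other targets */
#endif
}
```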


29

syscall_init() from: /usr/src/linux-4.9.6/arch/x86/kernel/cpu/common.c

30

MSR Flags from: /usr/src/linux-4.9.6/arch/x86/include/asm/msr-index.h

Where do these addresses come from?

31

What is an MSR?

A model-specific register (MSR) is any of various control registers in the x86 instruction set used for debugging, program execution tracing, computer performance monitoring, and toggling certain CPU features. (Wikipedia)

Model

Source : Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C: System Programming Guide, Part 3

32

Some MSRs

Source : Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C: System Programming Guide, Part 3

MSR Register Address(hex/Decimal)

Architectural MSR Name and bit fields

MSR/Bit Description

33

Register syscall entry to MSRs: /usr/src/linux-4.9.6/arch/x86/kernel/cpu/common.c

segment

x86 assembly procedure
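WRMSR is privileged, so the registration step can only be simulated outside the kernel. The sketch below keeps a fake MSR array and mirrors the two writes that matter for 64-bit system calls: the segment selectors into MSR_STAR and the entry address into MSR_LSTAR. The fake_* names are made up, and the selector values 0x10 and 0x23 are assumed here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural MSR addresses used by SYSCALL/SYSRET. */
#define MSR_STAR         0xc0000081u  /* segment selectors for SYSCALL/SYSRET */
#define MSR_LSTAR        0xc0000082u  /* 64-bit SYSCALL target address (RIP) */
#define MSR_CSTAR        0xc0000083u  /* compatibility-mode SYSCALL target */
#define MSR_SYSCALL_MASK 0xc0000084u  /* RFLAGS bits cleared on SYSCALL */

/* Assumed selector values, for illustration only. */
#define KERNEL_CS_SEL 0x10u
#define USER32_CS_SEL 0x23u

/* Fake MSR file, indexed by (address - MSR_STAR). */
static uint64_t fake_msr[4];

/* Like WRMSR: the 64-bit value is passed as two 32-bit halves (EDX:EAX). */
static void fake_wrmsr(uint32_t msr, uint32_t lo, uint32_t hi)
{
    assert(msr >= MSR_STAR && msr <= MSR_SYSCALL_MASK);
    fake_msr[msr - MSR_STAR] = ((uint64_t)hi << 32) | lo;
}

/* Convenience wrapper taking the full 64-bit value. */
static void fake_wrmsrl(uint32_t msr, uint64_t val)
{
    fake_wrmsr(msr, (uint32_t)val, (uint32_t)(val >> 32));
}

static uint64_t fake_rdmsr(uint32_t msr)
{
    assert(msr >= MSR_STAR && msr <= MSR_SYSCALL_MASK);
    return fake_msr[msr - MSR_STAR];
}

/* Simulated registration of the syscall entry point. */
static void fake_syscall_init(uint64_t entry_addr)
{
    /* High half of STAR: user selector base in bits 63..48, kernel CS in 47..32. */
    fake_wrmsr(MSR_STAR, 0, (USER32_CS_SEL << 16) | KERNEL_CS_SEL);
    /* LSTAR: the address the CPU loads into RIP on SYSCALL. */
    fake_wrmsrl(MSR_LSTAR, entry_addr);
}
```

After fake_syscall_init(addr), fake_rdmsr(MSR_LSTAR) returns addr, which is the value the SYSCALL instruction would later load into RIP on real hardware.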

34

CONTENTS

Background

Function Call Flow from start_kernel()

IDT initialization & Its Data Structure(gate, idt_table, MSR)

syscall entry, fast vs slow path, sys_call_table Initialization

system call procedure from user application and glibc

35

syscall_init() has external references

System.map is a symbol table that lists each symbol’s memory address, type, and name. Here “t” or “T” means the symbol is in the code (text) section

Register the entry address in the MSR

36

More details on entry_SYSCALL_64: /usr/src/linux-4.9.6/arch/x86/entry/entry_64.S

:

37

More details on entry_SYSCALL_64: /usr/src/linux-4.9.6/arch/x86/entry/entry_64.S

:

38

System call entry has two paths: fast vs slow (in entry_64.S)

39

System call entry has two paths: fast vs slow (in entry_64.S)

Invoke the appropriate system call

40

Fast vs slow system call

A fast syscall is one that is known to be able to complete without blocking or waiting. When the kernel encounters a fast syscall, it knows it can execute the syscall immediately and keep the same process scheduled (e.g. getuid(), getpid(), gettimeofday(), ...)

A slow syscall potentially requires waiting for another task to complete, so the kernel must prepare to pause the calling process and run another task.(e.g. sleep(), possibly read())

Source : http://unix.stackexchange.com/questions/14293/difference-between-slow-system-calls-and-fast-system-calls

41

sys_call_table: the array where system calls are listed

sys_call_table is an array of function pointers of type sys_call_ptr_t. Each entry points to a system call function that takes six arguments and returns a long

42

Initializing sys_call_table: /usr/src/linux-4.9.6/arch/x86/entry/syscall_64.c

Initialize every entry to a do-nothing function

Declaration of system call functions

Assign each system call function’s address to the sys_call_table array, using nr as the index

43

Wait a second

44

Wait a second

45

Designated Initializers

Standard C90 requires the elements of an initializer to appear in a fixed order, the same as the order of the elements in the array or structure being initialized.

In ISO C99 you can give the elements in any order, specifying the array indices or structure field names they apply to, ...

To specify an array index, write ‘[index] =’ before the element value. For example,

int a[6] = { [4] = 29, [2] = 15 };

is equivalent to

int a[6] = { 0, 0, 15, 0, 29, 0 };

To initialize a range of elements to the same value, write ‘[first ... last] = value’. This is a GNU extension. For example,

int widths[] = { [0 ... 9] = 1, [10 ... 99] = 2, [100] = 3 };

Source : https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html
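The two examples from the GCC manual can be checked directly. The sketch below declares both arrays and adds small bounds-checked accessors; a_at, width_at and WIDTHS_LEN are made-up helper names, and the range form [first ... last] needs GCC or Clang because it is a GNU extension.

```c
#include <assert.h>
#include <stddef.h>

/* C99 designated initializers: unnamed elements are zero-initialized. */
static const int a[6] = { [4] = 29, [2] = 15 };

/* GNU range designators: the array size is deduced from the largest index. */
static const int widths[] = { [0 ... 9] = 1, [10 ... 99] = 2, [100] = 3 };

#define WIDTHS_LEN (sizeof widths / sizeof widths[0])

/* Bounds-checked read of a[]. */
static int a_at(size_t i)
{
    assert(i < sizeof a / sizeof a[0]);
    return a[i];
}

/* Bounds-checked read of widths[]. */
static int width_at(size_t i)
{
    assert(i < WIDTHS_LEN);
    return widths[i];
}
```

Here WIDTHS_LEN evaluates to 101, since the highest designated index is 100.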

46

C Preprocessor’s #, ## Macro

#define STRING(x) #x : STRING(x) expands to “x”; # stringifies its argument

#define X(n) x##n : X(n) expands to xn; ## concatenates x and n

So let me unroll the initialization code of the sys_call_table array. For example, index 0:

[0] = __SYSCALL_64_QUAL_##qual(sys_read)
    = __SYSCALL_64_QUAL_(sys_read)    (## is concatenation and qual is empty)
    = sys_read
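The same expansion can be reproduced in a small user-space table. This is a toy version and not the kernel source: all TOY_* and toy_* names are made up, the functions take one argument instead of six to keep it short, and -38 stands in for -ENOSYS on Linux. It combines the range designator (default entry), the ## paste with an empty or non-empty qualifier, and # stringification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STRING(x) #x      /* STRING(sys_read) -> "sys_read" */
#define X(n) x##n         /* X(1) -> x1 */

static const int x1 = 7;
static int paste_demo(void) { return X(1); }  /* reads the variable x1 */
static size_t stringify_len(void) { return strlen(STRING(sys_read)); }

typedef long (*toy_call_ptr_t)(long);

/* Default entry for unimplemented numbers. */
static long toy_ni_syscall(long arg) { (void)arg; return -38; }
static long toy_read(long arg)        { return arg + 1; }
static long toy_write(long arg)       { return arg + 2; }
static long ptregs_toy_fork(long arg) { return arg + 3; }

/* An empty qualifier leaves the symbol alone; "ptregs" adds a prefix. */
#define TOY_QUAL_(sym) sym
#define TOY_QUAL_ptregs(sym) ptregs_##sym
#define TOY_SYSCALL(nr, sym, qual) [nr] = TOY_QUAL_##qual(sym),

#define TOY_NR_MAX 7

static const toy_call_ptr_t toy_call_table[TOY_NR_MAX + 1] = {
    /* First fill every slot with the do-nothing function... */
    [0 ... TOY_NR_MAX] = toy_ni_syscall,
    /* ...then override the implemented numbers. */
    TOY_SYSCALL(0, toy_read, )        /* [0] = toy_read, */
    TOY_SYSCALL(1, toy_write, )       /* [1] = toy_write, */
    TOY_SYSCALL(5, toy_fork, ptregs)  /* [5] = ptregs_toy_fork, */
};

/* Bounds-checked table lookup. */
static toy_call_ptr_t toy_lookup(size_t nr)
{
    assert(nr <= TOY_NR_MAX);
    return toy_call_table[nr];
}
```

Compilers may warn that later initializers override the range initializer; that overriding is exactly the mechanism being demonstrated.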

47

do_syscall_64() from: /usr/src/linux-4.9.6/arch/x86/entry/common.c

48

do_syscall_64() from: /usr/src/linux-4.9.6/arch/x86/entry/common.c

It invokes the system call with its arguments


These registers are constructed in entry_64.S
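A user-space sketch of the dispatch step, under stated assumptions: toy_regs holds only the registers that matter here, the two toy system calls and toy_do_syscall are made-up names, and the out-of-range behaviour assumes the entry code has already stored an error value in ax. The system call number comes from orig_ax, the six arguments from di, si, dx, r10, r8 and r9, and the return value is written back to ax.

```c
#include <assert.h>
#include <stddef.h>

/* Subset of the saved user registers, as laid out by the entry code. */
struct toy_regs {
    unsigned long di, si, dx, r10, r8, r9;  /* arguments 1..6 */
    unsigned long orig_ax;                  /* system call number */
    unsigned long ax;                       /* return value */
};

typedef long (*sys_call_ptr_t)(unsigned long, unsigned long, unsigned long,
                               unsigned long, unsigned long, unsigned long);

/* Toy system call 0: returns its first argument. */
static long toy_first(unsigned long a, unsigned long b, unsigned long c,
                      unsigned long d, unsigned long e, unsigned long f)
{
    (void)b; (void)c; (void)d; (void)e; (void)f;
    return (long)a;
}

/* Toy system call 1: returns the sum of all six arguments. */
static long toy_sum6(unsigned long a, unsigned long b, unsigned long c,
                     unsigned long d, unsigned long e, unsigned long f)
{
    return (long)(a + b + c + d + e + f);
}

static const sys_call_ptr_t toy_table[] = { toy_first, toy_sum6 };
#define TOY_NR_SYSCALLS (sizeof toy_table / sizeof toy_table[0])

/* Look up the handler by number and call it with the saved registers. */
static void toy_do_syscall(struct toy_regs *regs)
{
    unsigned long nr;

    assert(regs != NULL);
    nr = regs->orig_ax;
    if (nr < TOY_NR_SYSCALLS) {
        regs->ax = (unsigned long)toy_table[nr](regs->di, regs->si, regs->dx,
                                                regs->r10, regs->r8, regs->r9);
    }
    /* Out-of-range numbers leave ax untouched (assumed preset to an error). */
}
```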

49

do_syscall_64() uses the registers constructed in entry_SYSCALL_64

50

CONTENTS

Background

Function Call Flow from start_kernel()

IDT initialization & Its Data Structure(gate, idt_table, MSR)

syscall entry, fast vs slow path, sys_call_table Initialization

system call procedure from user application and glibc

51

syscall() from the Linux Programmer’s Manual

syscall() is a small library function that invokes the system call whose assembly language interface has the specified number with the specified arguments.

Architecture calling conventions

Old!

New!
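A minimal Linux-only sketch of syscall() in use: it invokes getpid by number and compares the result with the ordinary getpid() wrapper. SYS_getpid comes from <sys/syscall.h>; the two helper names are made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke getpid by system call number through the generic syscall() function. */
static long pid_via_syscall(void)
{
    long pid = syscall(SYS_getpid);
    assert(pid > 0);  /* getpid cannot fail */
    return pid;
}

/* The usual glibc wrapper, for comparison. */
static long pid_via_wrapper(void)
{
    return (long)getpid();
}
```

Both functions should return the same process ID, since the wrapper ends up issuing the same system call.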

52

System call from user application

Compile to assembly : gcc -S sys_mult.c

syscall in glibc

Intel x86-64 Instruction

53

More on syscall instruction in x86

Intel x86-64 Instruction

SYSCALL invokes an OS system-call handler at privilege level 0. It does so by loading RIP from the IA32_LSTAR MSR (after saving the address of the instruction following SYSCALL into RCX). (The WRMSR instruction ensures that the IA32_LSTAR MSR always contain a canonical address.)

syscall_init() from page 29

MSRs from page 32

Source : Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2: Instruction Set Reference
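The instruction can also be issued directly, without glibc. The sketch below assumes Linux on x86-64 with GCC or Clang and falls back to syscall() elsewhere; raw_syscall0 is a made-up name. The system call number goes in RAX and the result comes back in RAX, and because SYSCALL saves the return address in RCX (and RFLAGS in R11), both registers are listed as clobbered.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Issue a system call that takes no arguments. */
static long raw_syscall0(long nr)
{
    assert(nr >= 0);
#if defined(__linux__) && defined(__x86_64__) && defined(__GNUC__)
    long ret;
    __asm__ volatile("syscall"
                     : "=a"(ret)                  /* result returned in RAX */
                     : "a"(nr)                    /* system call number in RAX */
                     : "rcx", "r11", "memory");   /* overwritten by SYSCALL */
    return ret;
#else
    return syscall(nr);  /* portable fallback through the C library */
#endif
}
```

Unlike the glibc wrappers, the raw instruction reports failure as a negative error number in the return value instead of setting errno.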

54

System Call Architecture with glibc

Source : https://ko.wikipedia.org/wiki/GNU_C_라이브러리

To understand the actual process of a system call from application level to kernel level, you have to know additional functions in glibc (https://www.gnu.org/s/libc/)

Many other functions... I’ll cover these later if possible

55

Q & A