1 Linux Operating System 許 富 皓. 2 Processes Switch.


Linux Operating System

許 富 皓


Processes Switch


switch_to Macro

Assumptions: the local variable prev refers to the process descriptor of the process being switched out. next refers to the one being switched in to replace it.

switch_to(prev, next, last) macro:

First of all, the macro has three parameters called prev, next, and last.

The actual invocation of the macro in context_switch() is: switch_to(prev, next, prev).

In any process switch, three processes are involved, not just two.


Why Are 3 Processes Involved in a Context Switch?

[Figure: the Kernel Mode stacks of processes A, B, C, and D, each holding its own prev and next local variables. On the stack of A, prev = A and next = B; on the stack of C, prev = C and next = A. The code of switch_to is split into a front part and a rear part: at the boundary, the old process is suspended and the new process resumes. When A resumes, its own locals still read prev = A and next = B. Where is C?]


Why Is a Reference to C Needed?

To complete the process switch. P.S.: See Chapter 7, Process Scheduling, for more details.


The last Parameter

(F) Before the process switch, the macro saves in the eax CPU register the content of the variable identified by the first input parameter prev, that is, the prev local variable allocated on the Kernel Mode stack of C.

(R) After the process switch, when A has resumed its execution, the macro writes the content of the eax CPU register into the memory location of A identified by the third parameter last (= prev).

(R) The last parameter of the switch_to macro is an output parameter that specifies a memory location in which the macro writes the descriptor address of process C (of course, this is done after A resumes its execution).

(R) In the current implementation of schedule( ), the last parameter identifies the prev local variable of A, so prev is overwritten with the address of C.

(R) Because the eax register doesn't change across the process switch, this memory location receives the address of C's descriptor.

P.S.: (F) means the front part of switch_to; (R) means the rear part of switch_to.
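The role of last can be mimicked in plain C. The following is a minimal sketch, not kernel code: struct demo_task, struct demo_frame, demo_eax, and the demo_* helpers are hypothetical names invented here to stand in for a process descriptor, the prev/next locals on one Kernel Mode stack, the eax register, and the (F) and (R) parts of switch_to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a process descriptor. */
struct demo_task { const char *name; };

/* Hypothetical stand-in for the prev/next locals that live on one
 * process's Kernel Mode stack. */
struct demo_frame {
    struct demo_task *prev;
    struct demo_task *next;
};

/* Models the eax register, whose value survives the stack switch. */
static struct demo_task *demo_eax;

/* (F): runs on the stack of the process being switched out. */
static void demo_switch_front(const struct demo_frame *outgoing)
{
    demo_eax = outgoing->prev;      /* movl prev,%eax */
}

/* (R): runs on the stack of the process that resumes. */
static void demo_switch_rear(struct demo_frame *resumed)
{
    resumed->prev = demo_eax;       /* movl %eax,last (last is prev) */
}

/* Replays the scenario of the slides: A was switched out for B earlier,
 * and now C is switched out for A. Returns A's prev after A resumes. */
static struct demo_task *demo_last(struct demo_task *a,
                                   struct demo_task *b,
                                   struct demo_task *c)
{
    struct demo_frame stack_a = { a, b };  /* frozen when A was suspended */
    struct demo_frame stack_c = { c, a };  /* C is switching to A */

    demo_switch_front(&stack_c);           /* C is suspended here */
    demo_switch_rear(&stack_a);            /* A resumes here */
    return stack_a.prev;
}
```

Without the write in demo_switch_rear, A would still see prev = A after resuming and would have no reference to C.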


Code Execution Sequence & Get the Correct Previous Process Descriptor

[Figure: the Kernel Mode stacks of processes A, B, C, and D with their prev and next local variables (A: prev = A, next = B; C: prev = D), next to two copies of the switch_to code labeled "previous execution" and "current execution". The front part of the code performs %eax = prev, followed by movl 484(%edx),%esp and movl $1f, 480(%eax); the rear part performs prev = %eax, which leaves prev = C.]


From schedule to switch_to

schedule()

__schedule()

context_switch()

switch_to


Simplification for Explanation

The switch_to macro is coded in extended inline assembly language that makes for rather complex reading.

In fact, the code refers to registers by means of a special positional notation that allows the compiler to freely choose the general-purpose registers to be used.

Rather than follow the extended inline assembly language, we'll describe what the switch_to macro typically does on an 80x86 microprocessor by using standard assembly language.


switch_to (1)

Saves the values of prev and next in the eax and edx registers, respectively:

movl prev,%eax
movl next,%edx

The eax and edx registers correspond to the prev and next parameters of the macro.


switch_to (2)

Saves the contents of the eflags and ebp registers in the prev Kernel Mode stack.

They must be saved because the compiler assumes that they will stay unchanged until the end of switch_to :

pushfl

pushl %ebp


switch_to (3)

Saves the content of esp in prev->thread.sp so that the field points to the top of the prev Kernel Mode stack:

movl %esp,484(%eax)

The 484(%eax) operand identifies the memory cell whose address is the contents of eax plus 484.
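The displacement in such an operand is just the byte offset of a field inside the process descriptor. The sketch below shows the same base-plus-displacement addressing in C, assuming a hypothetical and much smaller layout: struct demo_task, struct demo_thread, and demo_save_sp are invented for illustration, and the real offsets (480, 484) depend on the kernel version and configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of thread_struct: only ip and sp. */
struct demo_thread {
    uintptr_t ip;
    uintptr_t sp;
};

/* Hypothetical miniature of a process descriptor. */
struct demo_task {
    char other_fields[64];      /* placeholder for everything else */
    struct demo_thread thread;
};

/* C equivalent of "movl %esp, OFFSET(%eax)": store the stack pointer
 * at the address obtained by adding a fixed offset to the base. */
static void demo_save_sp(struct demo_task *prev, uintptr_t esp)
{
    unsigned char *base = (unsigned char *)prev;
    size_t offset = offsetof(struct demo_task, thread)
                  + offsetof(struct demo_thread, sp);

    *(uintptr_t *)(base + offset) = esp;
}
```

Writing through base + offset and writing prev->thread.sp reach the same memory cell; the assembly form simply has the offset precomputed as a constant.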


switch_to (4)

Loads next->thread.sp in esp. From now on, the kernel operates on the Kernel Mode stack of next, so this instruction performs the actual process switch from prev to next.

Because the address of a process descriptor is closely related to that of the Kernel Mode stack, changing the kernel stack means changing the current process:

movl 484(%edx), %esp
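One way the stack pointer can identify the current process is sketched below. It assumes the classic 32-bit layout in which each process has an 8 KB, 8 KB-aligned Kernel Mode stack area with its bookkeeping structure at the lowest address; the slides do not state which layout their kernel version uses, and DEMO_THREAD_SIZE and demo_stack_base are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size and alignment of one Kernel Mode stack area. */
#define DEMO_THREAD_SIZE 8192u

/* Masks off the low bits of the stack pointer to get the base address
 * of the stack area, where the per-process bookkeeping would sit. */
static uintptr_t demo_stack_base(uintptr_t esp)
{
    return esp & ~((uintptr_t)DEMO_THREAD_SIZE - 1);
}
```

Under this assumption, any two stack pointer values inside the same area map to the same base, so loading a new value into esp is enough to change which process the kernel regards as current.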


switch_to (5)

Saves the address labeled 1 (shown later in this section) in prev->thread.ip.

When the process being replaced resumes its execution, the process executes the instruction labeled as 1:

movl $1f, 480(%eax)


switch_to (6)

On the Kernel Mode stack of next, the macro pushes the next->thread.ip value, which, in most cases, is the address labeled as 1:

pushl 480(%edx)


switch_to (7)

Jumps to the __switch_to( ) C function (P.S.: see next):

jmp __switch_to


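Graphic Explanation of the Front Part of switch_to

[Figure: the process descriptor and the Kernel Mode stack of both prev and next. Each process descriptor contains a struct thread_struct with sp and ip fields: sp = 0xyyyyyyyy and ip = label 1 for prev, sp = 0xzzzzzzzz and ip = label 1 for next. Both Kernel Mode stacks hold saved eflags and ebp values; one of them additionally holds the address of label 1. The esp register points into a Kernel Mode stack.]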


__switch_to


The __switch_to( ) function

The __switch_to( ) function does the bulk of the process switch started by the switch_to( ) macro.

It acts on the prev_p and next_p parameters, which denote the former process (e.g. process C in the Code Execution Sequence figure, slide 7) and the new process (e.g. process A in the same figure).

This function call is different from the average function call, though, because __switch_to( ) takes the prev_p and next_p parameters from the eax and edx registers (where we saw they were stored), not from the stack like most functions.


Before and Including Linux 2.6.24



Get Function Parameters from Registers

To force the function to go to the registers for its parameters, the kernel uses the __attribute__ and regparm keywords, which are nonstandard extensions of the C language implemented by the gcc compiler.


regparm

regparm (number)

On the Intel 386, the regparm attribute causes the compiler to pass up to number integer arguments in registers EAX, EDX, and ECX instead of on the stack.

Functions that take a variable number of arguments will continue to be passed all of their arguments on the stack.
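As an illustration of the attribute syntax, here is a minimal sketch of a function declared with regparm(3) under gcc. The DEMO_REGPARM macro and demo_sum3 function are hypothetical names; the attribute is applied only when compiling for 32-bit x86, since it is not meaningful on other targets.

```c
#include <assert.h>

/* Apply regparm only on 32-bit x86; elsewhere the macro expands to
 * nothing and the default calling convention is used. */
#if defined(__i386__)
#define DEMO_REGPARM __attribute__((regparm(3)))
#else
#define DEMO_REGPARM
#endif

/* On i386, a, b, and c would arrive in EAX, EDX, and ECX. */
DEMO_REGPARM int demo_sum3(int a, int b, int c);

DEMO_REGPARM int demo_sum3(int a, int b, int c)
{
    return a + b + c;
}
```

Callers are written exactly as for any other function; only the generated argument-passing code differs.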


Function Prototype of __switch_to( )

The __switch_to( ) function is declared in the include/asm-i386/system.h header file as follows:

__switch_to(struct task_struct *prev_p, struct task_struct *next_p) __attribute__((regparm(3)));


After and Including Linux 2.6.25



Function Prototype of __switch_to( )[ 何春晖]

__notrace_funcgraph struct task_struct * __switch_to(struct task_struct *prev_p, struct task_struct *next_p)

The Makefile, /arch/x86/Makefile, instructs the compiler to use registers to pass parameters.


__switch_to( ) (1)

Executes the smp_processor_id( ) macro to get the index of the local CPU, namely the CPU that executes the code.

The macro gets the index from the cpu field of the thread_info structure of the current process and stores it into the cpu local variable.


__switch_to( ) (2)

Loads next_p->thread.sp0 into the sp0 [1][2] field of the TSS relative to the local CPU; as we'll see in the section "Issuing a System Call via the sysenter Instruction", any future privilege level change from User Mode to Kernel Mode raised by a sysenter assembly instruction will copy this address into the esp register:

tss->x86_tss.sp0 = thread->sp0;

P.S.: When a process is created, the copy_thread() function sets the sp0 field to point to the byte immediately following the last byte of the Kernel Mode stack of the newborn process.


__switch_to( ) (3)

Loads in the Global Descriptor Table of the local CPU the Thread-Local Storage (TLS) segments used by the next_p process.

The three Segment Selectors are stored in the tls_array array inside the process descriptor. P.S.: See the section "Segmentation in Linux" in Chapter 2.

#define GDT_ENTRY_TLS_MIN 6

per_cpu(gdt_page, cpu).gdt[GDT_ENTRY_TLS_MIN + 0] = next_p->tls_array[0];
per_cpu(gdt_page, cpu).gdt[GDT_ENTRY_TLS_MIN + 1] = next_p->tls_array[1];
per_cpu(gdt_page, cpu).gdt[GDT_ENTRY_TLS_MIN + 2] = next_p->tls_array[2];


__switch_to( ) (4)

Stores the contents of the gs segmentation register in prev_p->thread.gs:

#define savesegment(seg, value) \
        asm("mov %%" #seg ",%0":"=r" (value) : : "memory")

#define lazy_save_gs(v) savesegment(gs, (v))

__switch_to(…)
{
        …
        struct thread_struct *prev = &prev_p->thread;
        …
        lazy_save_gs(prev->gs);
        …
}


__switch_to( ) (5)

If the gs segmentation register has been used by either the prev_p or the next_p process (that is, if either saved value is nonzero), loads into gs the value stored in the thread_struct descriptor of the next_p process.

__switch_to(…)
{
        …
        struct thread_struct *next = &next_p->thread;
        …
        if (prev->gs | next->gs)
                lazy_load_gs(next->gs);
        …
}
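The save-then-conditionally-load handling of gs can be replayed in user space. The following is only a sketch: demo_cpu_gs stands in for the real gs register, demo_gs_loads counts how often it is reloaded, and struct demo_thread and demo_switch_gs are hypothetical names, not kernel code.

```c
#include <assert.h>

/* Hypothetical miniature of thread_struct: only the saved gs. */
struct demo_thread { unsigned short gs; };

static unsigned short demo_cpu_gs;  /* models the gs register */
static int demo_gs_loads;           /* counts reloads of gs */

/* Mirrors the two steps: save gs for prev, then reload gs only if
 * either process has a nonzero value. */
static void demo_switch_gs(struct demo_thread *prev,
                           const struct demo_thread *next)
{
    prev->gs = demo_cpu_gs;             /* lazy_save_gs(prev->gs) */
    if (prev->gs | next->gs) {
        demo_cpu_gs = next->gs;         /* lazy_load_gs(next->gs) */
        demo_gs_loads++;
    }
}
```

The bitwise OR is zero only when both values are zero, so a switch between two processes that never touched gs skips the segment register load altogether.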


__switch_to( ) (6)

Updates the I/O bitmap in the TSS, if necessary:

void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
                      struct tss_struct *tss)
{
        struct thread_struct *prev, *next;

        prev = &prev_p->thread;
        next = &next_p->thread;
        ...
        if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) {
                memcpy(tss->io_bitmap, next->io_bitmap_ptr,
                       max(prev->io_bitmap_max, next->io_bitmap_max));
        } else
        ...
}


__switch_to( ) (7)-1

Terminates. The __switch_to( ) C function ends by means of the statement:

return prev_p;

The corresponding assembly language instructions generated by the compiler are:

movl %edi,%eax
ret

The prev_p parameter (now in edi) is copied into eax, because by default the return value of any C function is passed in the eax register.

Notice that the value of eax is thus preserved across the invocation of __switch_to( ); this is quite important, because the invoking switch_to( ) macro assumes that eax always stores the address of the process descriptor being replaced.


__switch_to( ) (7)-2

The ret assembly language instruction loads the eip program counter with the return address stored on top of the stack.

However, the __switch_to( ) function has been invoked simply by jumping into it. Therefore, the ret instruction finds on the stack the address of the instruction labeled as 1, which was pushed by the switch_to macro.

If next_p was never suspended before because it is being executed for the first time, the function finds the starting address of the ret_from_fork( ) function. P.S.: see the section "The clone( ), fork( ), and vfork( ) System Calls" later in this chapter.
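The push, jmp, ret sequence can be modeled with an explicit stack. This is a toy sketch rather than kernel code: struct demo_stack, the demo_* helpers, and the DEMO_* constants are hypothetical, and small integers stand in for code addresses.

```c
#include <assert.h>
#include <stddef.h>

/* Integers standing in for the two possible resume addresses. */
enum demo_label { DEMO_LABEL_1 = 1, DEMO_RET_FROM_FORK = 2 };

/* Toy Kernel Mode stack: top counts the slots in use. */
struct demo_stack {
    int slots[8];
    size_t top;
};

static void demo_push(struct demo_stack *s, int value)
{
    s->slots[s->top++] = value;
}

/* Models ret: pops the topmost slot and uses it as the next eip. */
static int demo_ret(struct demo_stack *s)
{
    return s->slots[--s->top];
}

/* Models "pushl next->thread.ip; jmp __switch_to; ...; ret". */
static int demo_resume_point(struct demo_stack *next_stack,
                             int next_thread_ip)
{
    demo_push(next_stack, next_thread_ip);  /* pushl 480(%edx) */
    /* jmp __switch_to: unlike call, a jump pushes no return address */
    return demo_ret(next_stack);            /* ret at the end */
}
```

Because the jump adds nothing to the stack, the value consumed by ret is exactly the saved thread.ip of next, whether that is label 1 or the entry point used by a newly created process.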


Resume the Execution of a Process


switch_to (8)

Here process A, which was replaced by B, gets the CPU again: it executes a few instructions that restore the contents of the eflags and ebp registers. The first of these two instructions is labeled as 1:

1: popl %ebp
popfl


switch_to (9)

Copies the content of the eax register (loaded in step 1 above) into the memory location identified by the third parameter last of the switch_to macro:

movl %eax, last

As discussed earlier, the eax register points to the descriptor of the process that has just been replaced.