Virtualization

Operating Systems 2020, I. Dinur, D. Hendler and M. Kogan-Sadetsky


What is virtualization?


Creating a virtual version of something
o Hardware, operating system, application, network, memory, storage

“The construction of an isomorphism between a guest system and a host” [Popek, Goldberg, ’74]


Example: virtual disk

Partition a single hard disk into multiple virtual disks
o Virtual disk has virtual tracks & sectors

Implement virtual disk by file

Map between virtual disk and real disk contents

Virtual disk write/read mapped to file write/read in host system
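The mapping described above can be sketched in a few lines. This is a minimal illustration, not how any particular hypervisor stores disk images: the class name `VirtualDisk`, the sector size and the toy geometry are all assumptions made for the example.

```python
import os
import tempfile

SECTOR_SIZE = 512        # bytes per virtual sector (assumed)
SECTORS_PER_TRACK = 8    # toy geometry (assumed)


class VirtualDisk:
    """A virtual disk whose contents live in an ordinary host file."""

    def __init__(self, path, tracks):
        self.path = path
        self.tracks = tracks
        # Pre-size the backing file so every virtual sector has a place in it.
        with open(path, "wb") as f:
            f.truncate(tracks * SECTORS_PER_TRACK * SECTOR_SIZE)

    def _offset(self, track, sector):
        # Map (virtual track, virtual sector) to a byte offset in the host file.
        if not (0 <= track < self.tracks and 0 <= sector < SECTORS_PER_TRACK):
            raise ValueError("address outside the virtual disk")
        return (track * SECTORS_PER_TRACK + sector) * SECTOR_SIZE

    def write_sector(self, track, sector, data):
        if len(data) > SECTOR_SIZE:
            raise ValueError("data larger than one sector")
        # A virtual disk write becomes a file write in the host system.
        with open(self.path, "r+b") as f:
            f.seek(self._offset(track, sector))
            f.write(data.ljust(SECTOR_SIZE, b"\0"))

    def read_sector(self, track, sector):
        # A virtual disk read becomes a file read in the host system.
        with open(self.path, "rb") as f:
            f.seek(self._offset(track, sector))
            return f.read(SECTOR_SIZE)


def demo():
    with tempfile.TemporaryDirectory() as d:
        disk = VirtualDisk(os.path.join(d, "vdisk.img"), tracks=4)
        disk.write_sector(2, 3, b"hello")
        return disk.read_sector(2, 3)[:5], os.path.getsize(disk.path)
```

Calling `demo()` writes one sector and reads it back through the file; the guest-visible address (track 2, sector 3) never appears in the host file system, only the computed offset does.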

What is virtualization? (continued)

A way to run multiple operating systems (and their applications) on the same hardware (virtual machines)

Only virtual machine monitor (a.k.a. hypervisor) has full system control

Virtual machines completely isolated from each other (or so we hope)


Basic concepts

Virtual Machine (VM)

Host

Guest

Hypervisor (type II) / Virtual Machine Monitor


Types of virtualization


Full virtualization – guest OS runs unmodified

Para-virtualization – guest OS must be aware of virtualization, source-code modifications required

Hardware virtualization support may be used for both

Our focus is on full virtualization

Virtualization advantages

Cost-effectiveness – less hardware
o Multiple virtual machines / operating systems / services on single physical machine (server consolidation)
o Various forms of computation as a service

Isolation
o Good for security
o Great for reliability and recovery: if a VM crashes it can be rebooted and does not affect other services (fault containment)
o VM migration

Development tool
o Work on multiple OSes in parallel
o Develop and debug OS in user mode
o Origins of VMware as a tool for developers

Virtualization vs. Multi-Processing

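[Figure: Multi-processing vs. virtualization. Multi-processing: processes (Process1, Process2, ...) run above the OS, with user space/kernel separation; the OS uses the HW interface to the HW (disk, NIC, ...). Virtualization: each VM contains an OS (OS1, OS2, ...) with its processes (Pr1, Pr2, ...) on top of a virtual HW interface; the VMM/hypervisor uses the real HW interface to the HW (disk, NIC, ...).]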

Type 1 and type 2 hypervisors

Figure 7-1. Location of type 1 and type 2 hypervisors.

Type 1: VMware ESX, Microsoft Hyper-V, Xen

Type 2: VMware Workstation, Microsoft Virtual PC, Sun VirtualBox, QEMU, KVM

Type 1 and type 2 hypervisors (continued)

Figure 7-2. Examples of the various combinations of virtualization type and hypervisor. Type 1 hypervisors always run on the bare metal whereas type 2 hypervisors use the services of an existing host operating system.

What's required of a (classic) hypervisor

Hypervisor should provide the following:

Safety: have full control of virtualized resources

Fidelity: program behavior on VM should be identical to its behavior on bare hardware

Efficiency: as much as possible, run directly on hardware without hypervisor intervention
o Full interpretation isn't efficient


Classic virtualization: trap and emulate

Emulation is the process of implementing the functionality/interface of one system on a system having different functionality/interface

[Figure: VM1 and VM2 run above the VMM, which runs on the HW. (1) Trap from a VM to the VMM; (2) the VMM's interrupt handler performs HW emulation; (3) return to process.]
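The three steps (trap, handler with HW emulation, return) can be sketched with a toy machine. Everything here is an assumption made for illustration: the instruction names, the `VCpu` fields and the function names are invented, and real hardware traps are replaced by a Python exception.

```python
class Trap(Exception):
    """Stands in for the hardware trap raised by a privileged instruction."""

    def __init__(self, instr):
        super().__init__(instr)
        self.instr = instr


PRIVILEGED = {"cli", "sti"}   # toy instruction set (assumed)


class VCpu:
    """Virtual CPU state that the VMM keeps for one VM."""

    def __init__(self):
        self.interrupts_enabled = True
        self.acc = 0


def hw_execute(vcpu, instr):
    """Direct execution: ordinary instructions run, privileged ones trap."""
    op = instr[0]
    if op in PRIVILEGED:
        raise Trap(instr)                 # (1) trap to the VMM
    if op == "add":
        vcpu.acc += instr[1]
    else:
        raise ValueError("unknown instruction: %r" % (instr,))


def vmm_emulate(vcpu, instr):
    """(2) VMM handler: apply the instruction's effect to virtual state only."""
    if instr[0] == "cli":
        vcpu.interrupts_enabled = False
    elif instr[0] == "sti":
        vcpu.interrupts_enabled = True


def run_guest(vcpu, program):
    traps = 0
    for instr in program:
        try:
            hw_execute(vcpu, instr)
        except Trap as t:
            traps += 1
            vmm_emulate(vcpu, t.instr)
        # (3) return to the guest: the loop continues with the next instruction
    return traps
```

Only the privileged instructions pay for a round trip through the VMM; the `add` instructions run without intervention, which is the efficiency requirement stated earlier.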

Trap and emulate: difficulties on x86

Sensitive instructions:
o Provide control over HW resources
o Behave differently in kernel/supervisor and user modes
o I/O instructions, enable/disable interrupts, access CR3 register…

Privileged instructions: cause a trap if executed in user mode

Theorem [Popek and Goldberg, 1974]

A machine can be virtualized [using trap and emulate] if every sensitive instruction is privileged.

Not supported by x86 processors prior to 2005

In 2005, Intel/AMD introduced virtualization HW support.
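The condition in the theorem is a subset test: the set of sensitive instructions must be contained in the set of privileged ones. The sketch below checks that condition for a deliberately simplified, hand-picked instruction classification; the two sets and the function names are assumptions for the example, not a full account of the x86 instruction set.

```python
def satisfies_popek_goldberg(sensitive, privileged):
    """True when every sensitive instruction is also privileged."""
    return set(sensitive) <= set(privileged)


def offending_instructions(sensitive, privileged):
    """Sensitive instructions that would run in user mode without trapping."""
    return sorted(set(sensitive) - set(privileged))


# Simplified toy classification (assumed for illustration).
SENSITIVE = {"cli", "mov_to_cr3", "out", "popf"}
PRIVILEGED = {"cli", "mov_to_cr3", "out"}
```

With these sets the check fails and `offending_instructions` reports `popf`, the instruction the following slides use as their example.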


What is sensitive?

CPU – some registers

MMU

o Page table

o Segments

Interrupts

Timers

I/O devices


X86 virtualization problem I

The x86 architecture (w/o virtualization extensions) can't be virtualized by trap and emulate.

Some sensitive instructions are not privileged.

Example: the popf instruction
o Pops 16 bits from stack to flags register
o One of the flags masks (i.e. disables) interrupts
o The instruction is not privileged
o What happens if the OS of a VM runs popf?
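A toy model of the behaviour in question: outside the most privileged level, popf does not trap, it simply leaves the interrupt flag unchanged. The function below is a simplification written for this note (it ignores every other flag rule, and its name and parameters are assumptions), but it shows why the hypervisor never gets a chance to intervene.

```python
IF_MASK = 1 << 9   # the interrupt-enable flag is bit 9 of the flags register


def popf(flags, popped, cpl, iopl=0):
    """Toy popf: return the new flags value.

    With enough privilege (cpl <= iopl) the popped value replaces the flags.
    Otherwise the interrupt flag silently keeps its old value and no trap
    is raised.
    """
    if cpl <= iopl:
        return popped
    return (popped & ~IF_MASK) | (flags & IF_MASK)


def guest_tries_to_disable_interrupts(cpl):
    """Guest kernel pops a flags image with IF cleared; report whether it worked."""
    new_flags = popf(IF_MASK, 0, cpl)
    return (new_flags & IF_MASK) == 0
```

On bare hardware (ring 0) the guest kernel disables interrupts as intended. De-privileged to ring 1, the same instruction does nothing and nobody is told, so trap and emulate has nothing to emulate.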


X86 virtualization problem II

Some instructions (push, pop, mov) can take segment selectors (cs, ds, ss) as arguments even in user mode, so the selectors can be read

The selectors have two bits that hold the current privilege level
o In x86 (beginning with the 386), there are four privilege levels (ring 0 to ring 3)
o The two lower bits of the cs register are the Current Privilege Level (CPL) of the code.
o Guest OS thinks that it is in ring 0.
o Guest OS is actually in ring 1

Result: guest OS confusion.
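Reading the privilege level out of cs is a two-bit mask. The sketch below shows the check a guest kernel could make; the selector values and the function names are illustrative assumptions.

```python
def current_privilege_level(cs_selector):
    """The two low bits of the cs selector give the CPL (ring 0 to ring 3)."""
    return cs_selector & 0b11


def guest_kernel_self_check(cs_selector):
    """A guest kernel that assumes it runs in ring 0 inspects its own cs."""
    cpl = current_privilege_level(cs_selector)
    if cpl == 0:
        return "ok"
    return "confused: expected ring 0, found ring %d" % cpl
```

For example, a selector of 0x08 reads as ring 0, while the same descriptor index with the low bits set to 1 (0x09) reads as ring 1. Since reading cs does not trap, the hypervisor cannot hide the difference.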


Implementation options

Avoid executing sensitive instructions
o Interpretation (Bochs, JSLinux).
o Binary translation – change executed code (VMware, QEMU).

Para-virtualization – re-compile guest OS (Xen, Denali).

Hardware assistance – Intel VT-x and AMD-V (used by KVM, Xen, VMware).


Outline

Concepts, classical CPU virtualization
o Binary translation

Memory virtualization

Operating Systems, Spring 2018, I. Dinur, D. Hendler and R. Iakobashvili


Binary translation

Binary translation is the process of translating one instruction set to another one.

Approach I: translate entire OS when loaded to VM

o Key problem – indirect control flow


Dynamic binary translation

Approach II: translate code on the fly

Simplest approach
o Keep table mapping old instructions to new instructions.
o Fetch old instruction.
o Use table to translate.
o Execute new instruction(s).

Problem: performance
o Overhead for every instruction similarly to interpretation.
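The fetch, translate, execute loop of the simplest approach can be sketched as follows. The table contents and the replacement names (`vcpu_clear_if`, `vcpu_set_if`) are hypothetical; "executing" an instruction is modelled by appending it to a list.

```python
# Table mapping old (guest) instructions to new instruction sequences.
# Entries are hypothetical; anything not listed is passed through unchanged.
TRANSLATION_TABLE = {
    "cli": ["vcpu_clear_if"],
    "sti": ["vcpu_set_if"],
}


def translate(instr):
    return TRANSLATION_TABLE.get(instr, [instr])


def run(program):
    executed = []
    lookups = 0
    for old in program:          # fetch old instruction
        new = translate(old)     # use table to translate
        lookups += 1
        executed.extend(new)     # execute new instruction(s)
    return executed, lookups
```

The lookup count always equals the number of guest instructions executed, including repeats inside loops. That per-instruction cost is the performance problem noted above.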


Dynamic BT with caching

Cache translated code region:
o After translation run from cache.
o Translation occurs only once.

Static translation cannot handle dynamic control transfer, for example:
o Jump depending on content of memory address.
o Indirect function call (by function pointer).

Translation of dynamic control transfer must be done at execution time.

User code does not have to be translated


Virtualization prior to HW support

Figure 7-4. The binary translation rewrites the guest operating system running in ring 1, while the hypervisor runs in ring 0


VMware binary translation: example

[Figure: C code, its 64-bit binary, and the binary (hex) representation.]

VMware binary translation: example (continued)

Translator reads guest memory at the address indicated by guest PC

Decodes instructions, creates Intermediate Representation (IR) objects

Accumulates IR objects into translation units (TUs)
o Basic blocks (BB), stops upon control flow

[Figure: the first TU and its compiled code fragment (CCF). Successive slides highlight the identical code, the translation of the jump BB, and the translation of the fall-through BB.]

Which basic block will be translated next?

[Figure: C code and its 64-bit binary, shown over several slides.]

VMware binary translation example: output

These continuations remain because the respective basic blocks were not executed

VMware binary translation operation

Translation cache (TC) stores translations done so far

A hash table tracks the input-to-output correspondence

Chaining optimization allows one CCF to jump directly to another without calling out of the translation cache

As TC gradually captures guest's working set, proportion of translation decreases

User code does not have to be translated
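A toy model of the translation cache, its hash table and chaining. This is a sketch under strong assumptions, not VMware's implementation: each guest basic block has a single statically known successor, translation just copies the block (the "identical code" case), and the class and field names are invented for the example.

```python
class CCF:
    """Compiled code fragment: the translation of one guest basic block."""

    def __init__(self, code, next_pc):
        self.code = code
        self.next_pc = next_pc
        self.chained = None      # direct link to the successor CCF, once known


class BinaryTranslator:
    def __init__(self, guest_blocks):
        self.guest_blocks = guest_blocks   # guest PC -> (instructions, next PC)
        self.tc = {}                       # hash table: guest PC -> CCF
        self.translations = 0
        self.lookups = 0

    def lookup_or_translate(self, pc):
        self.lookups += 1
        ccf = self.tc.get(pc)
        if ccf is None:
            instrs, next_pc = self.guest_blocks[pc]
            ccf = CCF(list(instrs), next_pc)   # identical-code translation
            self.tc[pc] = ccf
            self.translations += 1
        return ccf

    def run(self, pc, max_blocks):
        trace = []
        ccf = self.lookup_or_translate(pc)
        for _ in range(max_blocks):
            trace.extend(ccf.code)             # "execute" the fragment
            if ccf.next_pc is None:
                break
            if ccf.chained is None:
                # First exit from this CCF: leave the cache once, then chain.
                ccf.chained = self.lookup_or_translate(ccf.next_pc)
            ccf = ccf.chained                  # chained jump, no hash lookup
        return trace
```

Running a two-block guest loop for several iterations translates each block once and consults the hash table only until the chain links are in place. After that, execution stays inside the translation cache, which matches the observation that the proportion of translation decreases as the cache captures the working set.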

Dealing with privileged instructions: example

The cli (clear interrupts) instruction is privileged

Translated to: “vcpu.flags.IF=0”

Much faster than source binary!
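A sketch of what such a translation amounts to: the privileged instruction is replaced by a plain store into the virtual CPU's state, so no trap into the hypervisor is needed. The flag is assumed here to be the virtual interrupt-enable flag and is named `IF`; `make_vcpu` and `translate` are hypothetical helpers, and the translated code is modelled as a Python callable.

```python
from types import SimpleNamespace


def make_vcpu():
    # Virtual CPU state kept in ordinary memory (field names assumed).
    return SimpleNamespace(flags=SimpleNamespace(IF=1))


def translate(instr):
    """Return a callable standing in for the translated code fragment."""
    if instr == "cli":
        return lambda vcpu: setattr(vcpu.flags, "IF", 0)
    if instr == "sti":
        return lambda vcpu: setattr(vcpu.flags, "IF", 1)
    raise NotImplementedError(instr)
```

For instance, `translate("cli")(vcpu)` only updates `vcpu.flags.IF`; the real interrupt flag of the physical CPU is never touched.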


Memory allocation

Each VM usually receives a contiguous set of physical addresses.

o 1–4 Gbyte are typical values.

As far as VM is concerned, this is the physical memory of the machine.

The guest OS allocates pages to guest processes.


Memory management

Assumptions of OS in VM:
o Physical memory is a contiguous block of addresses from 0 to some n.
o OS can map any virtual page to any page frame.

Hypervisor must:
o Partition memory among VMs.
o Ensure virtual pages are mapped only to assigned page frames.

TLB miss: a miss in a HW-managed TLB (e.g. x86) causes the HW to walk the page table and load the translation itself.

VM OS must not manage real page table.


Option 1: brute force

[Figure: guest OS with its page directory and page table, defined as not R/W; hypervisor (VMM SW) with the VM memory layout; HW with CPU, CR3 and TLB. An access causes an interrupt and the VMM corrects the address.]


Brute force – description

Guest page tables are read and write protected in host system.

If guest OS reads page table (e.g. for page eviction), writes page table (e.g. after page fault), or changes CR3, the system traps.

The hypervisor then uses a VM memory layout to:
o Return answers to VM
o Update the layout

Hypervisor switches VM memory layout when new VM is scheduled.


Option 2: shadow page tables

[Figure: guest OS with its page directory and page table (G-CR3); hypervisor (VMM SW) with the shadow page table; HW with CPU, CR3 and TLB. An interrupt occurs and the VMM corrects the page table.]


Shadow page tables – description

Hypervisor maintains “shadow page tables”.

Guest page tables map: Guest VA (GVA) → Guest PA (GPA)

Shadow tables map: Guest VA → Host PA (HPA).

Hypervisor does not trap guest updates to its page table.
o Result – inconsistent guest page table and shadow page table.

When guest process accesses virtual address
o The physical address is not in the guest page table, but in the shadow page table.
o HW translates correctly, because it is aware only of shadow tables.


Shadow page tables – description (continued)

If address in TLB – TLB hit and no problem.

When guest process causes a page fault
o Hypervisor begins execution.
o If required, hypervisor updates shadow page table.

Performance is as good as native execution as long as there are no page faults.

Shadow page tables should be cached so that once a VM is re-scheduled the page table does not have to be rebuilt from scratch.


Shadow page tables – page faults (continued)

Two scenarios when handling a page fault. Hypervisor “walks” guest page table to determine which it is.

1. Guest page fault – no translation in guest page tables → “inject” page fault for guest to handle

2. Guest translation found → update shadow table accordingly
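The two scenarios can be sketched with single-level tables held in dictionaries. This is a simplification for illustration only: real page tables are multi-level, and the class `ShadowMmu`, its fields and the exception name are assumptions.

```python
class GuestPageFault(Exception):
    """Scenario 1: the fault is injected into the guest OS."""


class ShadowMmu:
    def __init__(self, guest_pt, gpa_to_hpa):
        self.guest_pt = guest_pt        # GVA page -> GPA page (owned by guest)
        self.gpa_to_hpa = gpa_to_hpa    # GPA page -> HPA page (hypervisor's map)
        self.shadow_pt = {}             # GVA page -> HPA page (what the HW uses)
        self.hidden_faults = 0          # faults resolved without the guest noticing

    def access(self, gva_page):
        # The HW is aware only of the shadow table.
        if gva_page in self.shadow_pt:
            return self.shadow_pt[gva_page]
        return self.handle_page_fault(gva_page)

    def handle_page_fault(self, gva_page):
        gpa_page = self.guest_pt.get(gva_page)   # walk the guest page table
        if gpa_page is None:
            raise GuestPageFault(gva_page)       # 1. inject fault into guest
        hpa_page = self.gpa_to_hpa[gpa_page]     # 2. guest translation found
        self.shadow_pt[gva_page] = hpa_page      #    update the shadow table
        self.hidden_faults += 1
        return hpa_page
```

Note that the guest may change `guest_pt` without the hypervisor noticing, as the description above allows; the shadow entry is filled in lazily on the next fault.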


Shadow page tables – updating CR3

[Figure, shown over three slides: three guest page tables, each with a shadow page table, together with the Virtual CR3 and the Real CR3. Slide taken from a presentation by VMware.]

Undiscovered guest page table

[Figure, shown over two slides: a fourth guest page table appears without a shadow page table; on the second slide a shadow page table is added for it. Virtual CR3 and Real CR3 are shown as before. Slide taken from a presentation by VMware.]


Option 3: Extended/nested page tables

[Figure: guest OS with its page directory and page table (CR3); hypervisor (VMM SW) with the host page tables (EPTP); HW with CPU and TLB.]


Nested/extended page tables - description

The name implies having page tables within page tables.

The essence of the idea is a hardware assist.
o Hardware has an extra pointer and the ability to walk an extra set of page tables.
o Idea is called Extended Page Tables (EPT) by Intel

Guest page tables hold Guest VA → Guest PA mapping, accessed via the standard CR3

Extended page tables hold Host VA → Host PA mapping, accessed via EPTP (EPT pointer).

Host VA = Guest PA


Walking extended page tables


Extended page tables – description (cont'd)

TLB as usual holds Guest VA → Host PA

On memory access
o If found in TLB – no problem.
o If not in TLB, but no page fault, hardware walks both tables and updates TLB.
o If page fault, then hypervisor gets host virtual page (guest physical page) and maps it to host physical page.
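The three cases on a memory access can be sketched with two dictionaries standing in for the two sets of page tables. This is a simplified model: real walks are multi-level, and the accesses made during the guest walk are themselves translated through the extended tables. The function and exception names are assumptions.

```python
class GuestPageFault(Exception):
    """No Guest VA -> Guest PA mapping: handled by the guest OS."""


class EptViolation(Exception):
    """No Guest PA -> Host PA mapping: handled by the hypervisor."""


def nested_translate(gva_page, guest_pt, ept, tlb):
    # Case 1: found in the TLB, which caches Guest VA -> Host PA directly.
    if gva_page in tlb:
        return tlb[gva_page]
    # Case 2: hardware walks both tables and updates the TLB.
    if gva_page not in guest_pt:
        raise GuestPageFault(gva_page)
    gpa_page = guest_pt[gva_page]          # first walk, via CR3
    if gpa_page not in ept:
        # Case 3: the hypervisor must map this guest physical page.
        raise EptViolation(gpa_page)
    hpa_page = ept[gpa_page]               # second walk, via EPTP
    tlb[gva_page] = hpa_page
    return hpa_page
```

A hypervisor handler for `EptViolation` would pick a host physical page, add the entry to `ept`, and let the access retry; afterwards the translation is served from the TLB without walking either table.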

Sources

“Modern Operating Systems”, 4th edition, A. Tanenbaum and H. Bos

“Virtual Machines”, J. E. Smith and R. Nair

A presentation by Niv Gilboa from CSE@BGU

“Formal requirements for virtualizable third generation architectures”, G. J. Popek and R. P. Goldberg, CACM, 1974

“A comparison of software and hardware techniques for x86 virtualization”, K. Adams and O. Agesen, ASPLOS 2006

A presentation by VMware
