Virtual machines
Jinyang Li
OS sits between h/w and app
[Diagram: apps (firefox, iTunes, emacs) call into the OS via syscalls; the OS runs on the hardware via the h/w interface (Intel manuals)]
OS abstracts the h/w interface
VMM virtualizes hardware interface
[Diagram: two guest OSes, each running apps (firefox, iTunes, emacs) via syscalls, each sit on an h/w interface provided by the virtual machine monitor; the virtual machine monitor itself runs on the hardware via the real h/w interface]
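VMM hosted architecture
[Diagram: the host OS runs on the hardware via the h/w interface (Intel manuals) and serves its apps via syscalls; the virtual machine monitor sits on top of the host OS and exposes an h/w interface to the guest OS, which serves its own apps via syscalls]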
History of virtualization
• Old idea dating from the 1960s
  – IBM VM/370: a VMM for IBM mainframes
  – Multiplex multiple OSes on expensive h/w
  – Desirable when few machines were around
• Interest died out in the 80s and 90s
  – PC h/w was cheap
Why VM today?
• Machine consolidation
  – N virtual machines → 1 physical machine
  – E.g. Amazon’s EC2 cloud
• VM simplifies software management
  – Bundle OS/libraries/configurations together
• Other cool uses
  – Security, fault tolerance, debugging …
Similarities of OS and VMM
• OS provides a virtual execution environment for processes
• VMM provides a virtual execution env (virtual hardware) for OSes
Differences btw. virtualization for processes and OSes
• How do a process and an OS use hardware resources?

| | process | OS |
|---|---|---|
| CPU | Non-privileged registers and instructions | + Privileged registers and instructions |
| memory | Virtual memory | + Physical memory |
| exceptions | Signals, errors | + Traps, interrupts |
| I/O | File system | Programmed I/O, DMA, interrupts |
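Complete machine simulation
#include <stdint.h>
#define REG_EAX 1
int32_t eip;          /* simulated instruction pointer */
int32_t regs[8];      /* simulated general-purpose registers */
int32_t segregs[4];   /* simulated segment registers */
...
for (;;) {
    read_instruction();                    /* fetch the instruction at eip */
    switch (decode_instruction_opcode()) {
    case OPCODE_ADD: {                     /* braces: declarations follow the label */
        int src = decode_src_reg();
        int dst = decode_dst_reg();
        regs[dst] = regs[dst] + regs[src];
        break;
    }
    case ..
    }
    eip += instruction_length;             /* advance to the next instruction */
}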
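The loop above leaves its helpers undefined, so here is a minimal self-contained sketch of the same fetch, decode, execute cycle. It assumes a made-up 3-byte instruction format (opcode, destination, source) rather than real x86 encodings; `OP_MOVI`, `OP_ADD`, `OP_HALT`, `struct cpu`, and `run()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy ISA (not x86): each instruction is 3 bytes: opcode, dst, src. */
enum { OP_HALT = 0, OP_MOVI = 1, OP_ADD = 2 };
enum { INSN_LEN = 3, NREGS = 8 };

struct cpu {
    uint32_t eip;           /* offset of the next instruction in the program */
    uint32_t regs[NREGS];   /* general-purpose registers */
};

/* Interpret prog until OP_HALT, an unknown opcode, or the end of the program.
 * Returns the number of instructions fetched. */
size_t run(struct cpu *c, const uint8_t *prog, size_t len)
{
    size_t executed = 0;
    assert(c != NULL && prog != NULL);
    while ((size_t)c->eip + INSN_LEN <= len) {
        const uint8_t *insn = prog + c->eip;   /* fetch */
        uint8_t op  = insn[0];                 /* decode */
        uint8_t dst = insn[1] % NREGS;
        uint8_t src = insn[2];
        c->eip += INSN_LEN;
        executed++;
        switch (op) {                          /* execute */
        case OP_MOVI:
            c->regs[dst] = src;                /* dst = 8-bit immediate */
            break;
        case OP_ADD:
            c->regs[dst] += c->regs[src % NREGS];
            break;
        default:                               /* OP_HALT or unknown opcode */
            return executed;
        }
    }
    return executed;
}
```

Every simulated instruction costs a fetch, several decodes, and a switch dispatch in software, which is where the slowdown of pure simulation comes from.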
Pros/Cons of simulation
• Pros
  – Controlled execution
  – Great for debugging
• Cons: too slow
  – 100x slowdown of CPU
  – The software decode+execution takes 100s to 1000s of cycles to execute one instruction
Virtualization’s goals
• Fidelity
  – Software on VMM executes identically to its execution on h/w
• Performance
  – Majority of guest instructions are directly executed by hardware
• Safety
  – VMM manages h/w resources, provides isolation etc.
Virtualization challenges
• Insight: execute most instructions as they are
  – ADD $1, %eax
• Challenges:
  – How to execute privileged instructions?
    • lgdt, cli, hlt
  – How to virtualize the MMU?
  – How to prevent guest from overwriting host or other guests?
    • mov $123, %cr3
  – How to virtualize I/O?
Basic CPU virtualization techniques
1. Trap-and-emulate
  • KVM, QEMU
2. Paravirtualization
  • Xen
3. Dynamic binary translation
  • VMware
Technique #1: trap-n-emulate
• “trap-n-emulate” (classical virtualization)
  – Run guest OS at “lesser” privilege
  – Privileged instructions cause “traps”
  – VMM runs simulator on trapped instructions
  – (Most) non-privileged instructions do not need traps
  – Need h/w support
Technique #1: x86 challenges
• Traditional x86 is not amenable to #1
• Problems:
  – Many privileged instructions do not trap!
    • popf does not trap if it cannot modify system flags
  – Hardware-managed TLB
    • On TLB miss, h/w automatically loads from page table (VMM cannot intercept this event)
Technique #1: h/w support
• AMD’s SVM and Intel’s VT extensions to x86
  – Starting in late 2005
  – AMD Athlon 64, Intel P4, Intel Core …
• Many VMMs now utilize this h/w support
  – VMware, QEMU, KVM, VirtualBox, …
• More than just simple fixes
  – I.e. more than just making sure privileged instructions trap
• H/w support’s goal: minimize traps and emulation in VMM
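Technique #1: h/w support
[Diagram: privilege levels CPL=3 (apps) and CPL=0 (OS) on one axis; VMX root mode (VMM) and VMX non-root mode (guest OSes with their apps) on the other; vmrun transfers control from the VMM to a guest and vmexit transfers it back]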
Technique #1: h/w support
• VMM sets up an in-memory VM control data structure (VMCS) per VM
• VMCS virtualizes
  – System registers:
    • %CR0, %CR3, %EIP, %eflags, %CS, %SS, …
• VMCS allows VMM to specify exit controls:
  – E.g. whether to trap upon “HLT”, “LGDT” instructions
• Effects: fewer traps
Technique #2: paravirtualization
• Fancy word for “we have to modify and recompile OS”
• Popular back when x86 was not easily virtualizable
• VMM runs in privileged mode, VMs run in unprivileged mode
• Modified OS to call into VMM for memory, I/O, interrupt setup, etc.
  – ~3000 LoC modifications for Linux, ~5000 LoC for XP
Technique #3: dynamic binary translation
• We have seen BT before. Where?
  – Eraser intercepts all memory reads/writes to check for lock protection
• How BT enables software virtualization:
  – find all privileged instructions in OS and replace them with call-ins to VMM for emulation
• Why not static binary translation?
• Popularized by VMware
  – QEMU also supports BT
Technique #3: binary translation
void clearbal() {
    while (balance > 0)
        balance--;
}
Original code:
…
804836d: a1 8c 95 04 08    mov 0x804958c,%eax
8048372: 83 e8 01          sub $0x1,%eax
8048375: a3 8c 95 04 08    mov %eax,0x804958c
804837a: a1 8c 95 04 08    mov 0x804958c,%eax
804837f: 85 c0             test %eax,%eax
8048381: 7f ea             jg 804836d
8048383: c3                ret
…
[Diagram: the translation engine turns the original code into translated code stored in the code cache]
Code cache:
90d: mov… sub… mov… mov… test… jg call<TE_jmp>(804836d) call<TE_ret>
Address map (Original → Cache): 804836d → 90d, … → …
Jump after the target has been looked up in the map: jg 90d
Technique #3: binary translation
• Is BT applied on user-level programs?
• BT performance
  – Most instructions can be executed identically
  – Incur translation overhead only the first time code is executed
  – Intercepting and emulating privileged instructions is expensive
    • e.g. syscalls
  – BT slows down call/ret control flow
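Memory virtualization
[Diagram: machine memory (addresses from MA=0, 4G in total) backs two guests, OS1 and OS2, each of which sees its own physical memory from PA=0 to 1G. The guest's page table (%cr3 → VA to PA entries) cannot be handed to the h/w directly: "Can h/w use this page table?" Instead, the shadow page table (%cr3 → VA to MA entries) is what the h/w walks]
VMM gives the corresponding shadow page table to h/w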
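A shadow page table is the composition of two mappings: the guest's VA → PA table and the VMM's PA → MA map. The sketch below shows that composition on flat arrays of page numbers. The names `build_shadow`, `guest_pt`, `pmap`, and the 8-page size are assumptions for illustration; real x86 page tables are multi-level and carry permission bits.

```c
#include <assert.h>
#include <stdint.h>

enum { NPAGES = 8, UNMAPPED = -1 };   /* tiny hypothetical address space */

/* Build the shadow table (virtual page -> machine page) from the guest's
 * page table (virtual page -> guest "physical" page) and the VMM's pmap
 * (guest "physical" page -> machine page). */
void build_shadow(const int guest_pt[NPAGES], const int pmap[NPAGES],
                  int shadow[NPAGES])
{
    assert(guest_pt != 0 && pmap != 0 && shadow != 0);
    for (int vpn = 0; vpn < NPAGES; vpn++) {
        int ppn = guest_pt[vpn];
        if (ppn == UNMAPPED) {
            shadow[vpn] = UNMAPPED;    /* guest has no mapping: neither do we */
            continue;
        }
        assert(ppn >= 0 && ppn < NPAGES);
        shadow[vpn] = pmap[ppn];       /* VA -> PA -> MA collapsed to VA -> MA */
    }
}
```

The hardware only ever sees `shadow`, so the guest keeps believing its physical memory starts at 0 while its pages actually live wherever `pmap` places them.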
Maintain shadow page tables
• Correctness requires:
  – A shadow pg table must be consistent with its actual pg table
• Strawman 1:
  – On switching address space (“mov %cr3 …”), construct a fresh shadow pg table
  – Drawback: incurs expensive addr space switch overhead
• Strawman 2:
  – On switching address space, use an empty shadow pg table
  – Upon incurring page faults, modify shadow PTE according to actual PTE
  – Drawback: incurs many hidden pg faults
Maintain shadow page tables
• Can VMM cache shadows?
  – Challenge: what if OS modifies one of the pg tables w/o knowledge of VMM?
• Insight: write-protect actual pg tables.
  – Referred to as “memory traces”
• VMM may choose not to populate all shadow PTEs at once
  – Saves addr space switch time
  – Fewer hidden pg faults than strawman #2 because shadow PTEs are cached
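The combination of lazy filling and memory traces can be sketched as follows. This is a simplified model under stated assumptions: page tables are flat arrays, `trace_guest_pt_write()` stands in for the write-protection fault the VMM would receive, and `struct mmu`, `mmu_init()`, and `translate()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

enum { NPAGES = 8, UNMAPPED = -1 };   /* tiny hypothetical address space */

struct mmu {
    int guest_pt[NPAGES];     /* VA -> PA, owned by the guest, write-protected */
    int pmap[NPAGES];         /* PA -> MA, owned by the VMM */
    int shadow[NPAGES];       /* VA -> MA, cached and filled lazily */
    unsigned hidden_faults;   /* faults handled by the VMM, unseen by the guest */
};

void mmu_init(struct mmu *m, int machine_base)
{
    assert(m != 0);
    for (int i = 0; i < NPAGES; i++) {
        m->guest_pt[i] = UNMAPPED;
        m->shadow[i] = UNMAPPED;
        m->pmap[i] = machine_base + i;
    }
    m->hidden_faults = 0;
}

/* Memory trace: a guest write to its page table faults into the VMM, which
 * applies the write and invalidates the now-stale shadow entry. */
void trace_guest_pt_write(struct mmu *m, int vpn, int ppn)
{
    assert(m != 0 && vpn >= 0 && vpn < NPAGES);
    assert(ppn == UNMAPPED || (ppn >= 0 && ppn < NPAGES));
    m->guest_pt[vpn] = ppn;
    m->shadow[vpn] = UNMAPPED;
}

/* Translate as the h/w would, using only the shadow table. A missing shadow
 * entry is a hidden fault if the guest does have a mapping. */
int translate(struct mmu *m, int vpn)
{
    assert(m != 0 && vpn >= 0 && vpn < NPAGES);
    if (m->shadow[vpn] == UNMAPPED) {
        int ppn = m->guest_pt[vpn];
        if (ppn == UNMAPPED)
            return UNMAPPED;               /* true fault: forward to guest */
        m->hidden_faults++;
        m->shadow[vpn] = m->pmap[ppn];     /* fill the shadow PTE on demand */
    }
    return m->shadow[vpn];
}
```

Repeated accesses to the same page hit the cached shadow entry, so a hidden fault is paid once per mapping instead of once per address-space switch.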
More h/w support
• Intel/AMD added h/w support for memory virtualization
  – e.g. Intel Core i7 (Q4 2008)
  – Add new table from PA to MA
  – h/w traverses two pg tables, VA→PA and PA→MA, to fill TLB
Virtualize I/O
• OS communicates with I/O devices via
  – Special instructions in/out
  – Memory-mapped I/O (MMIO)
  – Interrupts
  – DMA
• Virtualization
  – In/out and MMIO accesses must trap into VMM
  – Run simulation of I/O device
• Simulation:
  – Interrupt: generate interrupt in CPU simulator
  – DMA: copy data to/from physical memory of VM
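As a rough sketch of what "run simulation of I/O device" can look like, the handler below is what a VMM might call when a guest `out` instruction traps. The device is invented for illustration: the port numbers `PORT_DMA_ADDR` and `PORT_READ_SECTOR`, the 16-byte sectors, and `struct vm` are all assumptions and do not describe any real controller.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simulated disk; ports and sizes are made up. */
enum { GUEST_MEM = 256, SECTOR = 16, NSECTORS = 4 };
enum { PORT_DMA_ADDR = 0x10, PORT_READ_SECTOR = 0x11 };

struct vm {
    uint8_t  mem[GUEST_MEM];          /* the VM's "physical" memory */
    uint8_t  disk[NSECTORS][SECTOR];  /* backing store of the simulated disk */
    uint32_t dma_addr;                /* latched by a write to PORT_DMA_ADDR */
    int      irq_pending;             /* virtual interrupt line to the guest */
};

/* Called by the VMM when a guest `out` traps.
 * Returns 0 on success, -1 for an unknown port or an invalid request. */
int vmm_handle_out(struct vm *vm, uint16_t port, uint32_t val)
{
    assert(vm != NULL);
    switch (port) {
    case PORT_DMA_ADDR:
        vm->dma_addr = val;           /* remember where the guest wants data */
        return 0;
    case PORT_READ_SECTOR:
        if (val >= NSECTORS || vm->dma_addr > GUEST_MEM - SECTOR)
            return -1;                /* reject out-of-range requests */
        /* Simulated DMA: copy into the VM's physical memory. */
        memcpy(vm->mem + vm->dma_addr, vm->disk[val], SECTOR);
        vm->irq_pending = 1;          /* simulated completion interrupt */
        return 0;
    default:
        return -1;
    }
}
```

The bounds check before the copy matters: the simulated DMA must never write outside the VM's own memory, otherwise the device model becomes a way for a guest to corrupt the VMM.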
Managing memory in VMM
• Configure VMs to use more “physical” memory than actually available
• What happens when running out of memory?
• Strawman: use LRU paging at VMM
  – OS already uses LRU → double paging
  – OS will recycle whatever “physical page” VMM just paged out
  – Better to do random eviction
ESX: Reclaiming pages
• Idea: trick OS into returning memory to VMM
• OS is better at deciding what to swap
  – Normally OS uses all available memory
  – E.g. buffer cache contains old pages; OS won’t discard them if it doesn’t need memory
• ESX trick: balloon driver
Balloon driver
[Diagram: VMM beneath OS1 and OS2, with a balloon inside a guest OS]
Balloon is a special pseudo-device loaded into OS
VMM instructs balloon to inflate or deflate depending on memory pressure
Balloon inflates by requesting lots of “pinned” memory pages
To accommodate inflated balloon, OS releases/swaps out some of its memory pages
Balloon tells VMM to recycle its “private” pinned pages
ESX: sharing pages across VMs
• Many VMs run same OS and programs
  – Many Linux boxes with Apache server
• Idea: use 1 machine page for identical physical pages
• Periodically scan to find identical machine pages
  – Do copy-on-write to eliminate redundancy
• Optimization: use a hash table keyed by hash(content)
  – Allows quick lookup based on page content
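A minimal sketch of content-based sharing with copy-on-write follows. It is an illustration under assumptions, not ESX's implementation: pages are 16 bytes, there are only 4 of them, the hash is FNV-1a, and `struct host`, `share_scan()`, and `guest_write()` are hypothetical names. A hash match is always confirmed with a full `memcmp` before two pages are merged.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { PG_BYTES = 16, NPAGES = 4, HSLOTS = 16 };   /* tiny hypothetical sizes */

struct host {
    uint8_t machine[NPAGES][PG_BYTES];  /* machine pages */
    int     refcount[NPAGES];           /* guest pages mapped to each one */
    int     p2m[NPAGES];                /* guest "physical" page -> machine page */
};

void host_init(struct host *h)
{
    memset(h, 0, sizeof *h);
    for (int i = 0; i < NPAGES; i++) { h->p2m[i] = i; h->refcount[i] = 1; }
}

static uint32_t hash_page(const uint8_t *p)        /* FNV-1a over the content */
{
    uint32_t x = 2166136261u;
    for (int i = 0; i < PG_BYTES; i++) { x ^= p[i]; x *= 16777619u; }
    return x;
}

/* One scan pass; returns the number of machine pages freed by sharing. */
int share_scan(struct host *h)
{
    int table[HSLOTS];                  /* hash(content) -> machine page */
    int freed = 0;
    for (int s = 0; s < HSLOTS; s++) table[s] = -1;
    for (int g = 0; g < NPAGES; g++) {
        int m = h->p2m[g];
        uint32_t slot = hash_page(h->machine[m]) % HSLOTS;
        while (table[slot] != -1 && table[slot] != m) {
            int cand = table[slot];
            if (memcmp(h->machine[cand], h->machine[m], PG_BYTES) == 0) {
                h->p2m[g] = cand;       /* share the existing copy */
                h->refcount[cand]++;
                if (--h->refcount[m] == 0) freed++;
                break;
            }
            slot = (slot + 1) % HSLOTS; /* same slot, different content */
        }
        if (table[slot] == -1) table[slot] = m;
    }
    return freed;
}

/* Guest write with copy-on-write. Returns 0, or -1 if no machine page is free. */
int guest_write(struct host *h, int g, int off, uint8_t val)
{
    int m;
    assert(h != NULL && g >= 0 && g < NPAGES && off >= 0 && off < PG_BYTES);
    m = h->p2m[g];
    if (h->refcount[m] > 1) {           /* shared page: break sharing first */
        int fresh = -1;
        for (int i = 0; i < NPAGES; i++)
            if (h->refcount[i] == 0) { fresh = i; break; }
        if (fresh < 0) return -1;
        memcpy(h->machine[fresh], h->machine[m], PG_BYTES);
        h->refcount[m]--;
        h->refcount[fresh] = 1;
        h->p2m[g] = m = fresh;
    }
    h->machine[m][off] = val;
    return 0;
}
```

The hash table turns the scan into roughly one lookup per page instead of comparing every page against every other page; copy-on-write keeps the sharing invisible to guests that later modify a shared page.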
Idle memory tax
• Proportional share memory allocation
  – Important VM gets more memory
  – Reclaim memory from VM with smallest “shares-to-pages” (S/P) ratio
  – If $S_A = 2 S_B$, A can have 2x as much memory as B
• Problem:
  – high-share VMs hoard more memory than needed
• Solution: idle memory tax
  – Instead of S/P, reclaim from VM w/ smallest $\frac{S}{P \cdot (f + k(1 - f))}$
  – f: fraction of non-idle pages
  – k ≥ 1: a configurable idle page “cost” parameter
  – Statistically sample to determine f
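The adjusted ratio can be computed directly. The sketch below evaluates S / (P · (f + k(1 − f))) for each VM and picks the one with the smallest value as the reclamation victim; `struct vm_mem`, `adjusted_ratio()`, and `pick_victim()` are hypothetical names, and the numbers in the usage note are made up.

```c
#include <assert.h>

struct vm_mem {
    double shares;        /* S: shares assigned to the VM */
    double pages;         /* P: pages currently allocated to the VM */
    double active_frac;   /* f: sampled fraction of non-idle pages, in [0, 1] */
};

/* Adjusted shares-per-page ratio: S / (P * (f + k * (1 - f))).
 * With k = 1 this reduces to the plain S/P ratio. */
double adjusted_ratio(const struct vm_mem *v, double k)
{
    double f;
    assert(v != 0 && v->pages > 0.0 && k >= 1.0);
    f = v->active_frac;
    return v->shares / (v->pages * (f + k * (1.0 - f)));
}

/* Index of the VM to reclaim from: the one with the smallest adjusted ratio. */
int pick_victim(const struct vm_mem *vms, int n, double k)
{
    int victim = 0;
    assert(vms != 0 && n > 0);
    for (int i = 1; i < n; i++)
        if (adjusted_ratio(&vms[i], k) < adjusted_ratio(&vms[victim], k))
            victim = i;
    return victim;
}
```

For example, take VM A with 200 shares, 100 pages, and f = 0.2, and VM B with 100 shares, 100 pages, and f = 1.0. With k = 1 the ratios are 2 and 1, so B is the victim despite using all its memory. With k = 4, A's ratio drops to 200 / (100 · 3.4) ≈ 0.59 while B's stays at 1, so the mostly idle high-share VM A is reclaimed from instead.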
Summary: VMM attributes
• Software compatibility
  – Runs all software
• Low overhead
  – Near “raw” machine performance
• Complete isolation
  – Total data isolation between virtual machines
• Encapsulation
  – VMs are not tied to physical machines
  – Checkpoint/migration
Example: VMM-based IDS
• Tradeoffs of intrusion detection systems (IDS):
  – Host-based IDS:
    • Good visibility to detect intruder
    • Weak isolation from intruder disabling IDS
  – Network-based IDS:
    • Good isolation from attacker
    • Weak visibility of what’s actually going on
• Can we have both visibility and isolation?
Example: VMM-based IDS
• Strong isolation
  – VMM isolates software in VM from VMM
  – Compromised OS cannot disable IDS in VMM
• Introspection: peek inside the VM
  – Examine physical memory, registers, I/O devices for patterns of break-ins
• Interposition: modify h/w abstraction to enhance security
Compute Utility
• Virtual appliance abstraction
  – Target specialized environment (e.g. program development)
  – Store targeted VMs in centralized repository
  – Cached on running machines
• Benefits:
  – Simplified system admin
  – Mobility: computing environment follows user around
Transparent replication
• Replicate VMs across multiple physical machines
  – If one fails, another can take over immediately
• No software modification necessary
• Preserves all active network connections