Operating Systems & Virtual Machines Xen and the Art of Virtualization PhD Program, Seminars on “Advances in Operating Systems” Dott. Luca Veraldi, PhD Student in Computer Science 12th July 2006

Operating Systems & Virtual Machines

Xen and the Art of Virtualization

PhD Program, Seminars on “Advances in Operating Systems”

Dott. Luca Veraldi, PhD Student in Computer Science

12th July 2006

Agenda

• Why Virtualization
  • challenges
• Virtualization Concepts
• Xen
  • the overall picture
  • implementation details: MM, CPU, I/O, Exc/Int, Scheduling
  • administration tools
  • the cost of porting guest OS to Xen
  • performance evaluation
• VM Migration using Virtualization techniques
  • general concepts
  • implementation over Xen
  • related work
• References

Why Virtualization

• The OS is already a virtualization layer in itself
  • it virtualizes FW resources for the applications’ sake
• Key concepts in the definition of OSs
  • modularity
  • maintainability
  • efficiency
  • expandability
  • isolation?
• Existing OSs pay little attention to isolation matters
• We would like to isolate application domains
  • drivers from the rest of the OS
  • OS instances from one another

[Figure: layered stack of Firmware, OS and Applications; the OS gives applications virtualized Firmware access]

Why Virtualization - Challenges

• Either a matter of security or performance
  • isolating applications of mutually untrusting users
  • the problem of device drivers
    • (untrusted) code running in privileged mode
    • potentially complete control over OS data structures
    • critical, complex and bug-prone
  • cutting down server management activities
    • isolation avoids unexpected configuration interactions between services
• List of desiderata:
  • allow multiple OS instances (domains or Virtual Machines)
  • isolated physical memory for each domain
  • restricted/verified privileged behavior
  • isolated devices
  • verified DMA accesses
  • performance isolation:
    • execution of one domain must not affect the performance of another one
  • generality
    • support for binaries is fundamental
    • support for as wide a variety of existing OSs as possible
    • unchanged guest OS?
  • efficiency:
    • virtualization techniques must not introduce significant overheads
• Some issues already targeted by µ/Exo kernels

Virtualization Concepts

• A further virtualization layer in the middle between OS and FW: the VMM
• Allows for multiple concurrent OS instances
  • modern PCs are powerful enough to create the illusion of several OS virtual machines running simultaneously
• Allows for OS migration
• Two different scenarios:
  • full virtualization
    • there is a complete functional ordering between layers
    • full abstraction of the machine (from BIOS to disks, DMA controllers, video…)
    • virtualization is fully transparent: guest OS unchanged
    • much more complex to design and implement
      • VMware overheads due to shadow page tables

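[Figure: a native stack (FW, OS, Apps) next to a virtualized stack (FW, VMM, several OSs, each running its own Apps); in the layered view Firmware, VMM, OS, Applications, the interface of the FW is fully abstracted]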

Virtualization Concepts

• A further virtualization layer in the middle between OS and FW: the VMM
• Allows for multiple concurrent OS instances
  • modern PCs are powerful enough to create the illusion of several OS virtual machines running simultaneously
• Allows for OS migration
• Two different scenarios:
  • para-virtualization
    • not a strictly hierarchical ordering between layers
    • the virtualized interface is similar to the FW interface, but neither complete nor identical
    • guest OSs must be modified to become VM-aware
      • there is a potential gain in performance, due to specialization of kernel code
    • easier to design
      • but think carefully about interfaces

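[Figure: a native stack (FW, OS, Apps) next to a virtualized stack (FW, VMM, several OSs, each running its own Apps); in the layered view Firmware, VMM, OS, Applications, the VMM interface is marked “this interface is much more critical, now”]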

Xen – The overall picture

• Para-virtualized approach
  • non-pure hierarchy between OS and VMM
  • we export both the real FW and intermediate VMM abstractions to the layers above
    • speeds up performance, reduces the levels of interpretation
• Isolate the VMM layer from the OS
  • use protection levels for ASM instructions (a.k.a. Intel protection rings)
  • rings 0 to 3: natively, applications run at ring 3 and the OS at ring 0
  • modify the guest OS to run in the less privileged ring 1
  • privileged operations are performed by the VMM in ring 0
  • if the FW does not provide enough rings, run the OS within the same protection ring as applications

[Figure: Apps in Ring 3, guest OSs in Ring 1, VMM in Ring 0. Between Apps and OS: system calls, signals, events, and scheduling of processes. Between OS and VMM: hypercalls, events, and scheduling of Virtual Machines]
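A minimal sketch of the ring arrangement above, written as a toy Python model and not as Xen's real interface. The class names, the `set_timer` operation and the `hypercall` signature are hypothetical; they only illustrate why a guest kernel moved to ring 1 has to replace direct privileged operations with hypercalls.

```python
class PrivilegeFault(Exception):
    """Raised when code outside ring 0 attempts a privileged operation."""


class Vmm:
    """Toy VMM: the only component that runs at ring 0."""
    RING = 0

    def __init__(self):
        self.timers = {}  # domain name -> programmed timer deadline

    def privileged_set_timer(self, caller_ring, domain, deadline):
        # Stand-in for the hardware check: privileged operations
        # succeed only when issued from ring 0.
        if caller_ring != 0:
            raise PrivilegeFault(f"ring {caller_ring} cannot program the timer")
        self.timers[domain] = deadline

    def hypercall(self, domain, op, *args):
        # Hypercall entry point: control moves to ring 0, where the VMM
        # validates the request before acting on the guest's behalf.
        if op == "set_timer":
            (deadline,) = args
            if deadline < 0:
                raise ValueError("invalid deadline")
            self.privileged_set_timer(self.RING, domain, deadline)
            return "ok"
        raise ValueError(f"unknown hypercall {op!r}")


class GuestOs:
    """Toy guest OS, modified to run in ring 1."""
    RING = 1

    def __init__(self, name, vmm):
        self.name = name
        self.vmm = vmm

    def set_timer_directly(self, deadline):
        # What an unmodified kernel would try: fails outside ring 0.
        self.vmm.privileged_set_timer(self.RING, self.name, deadline)

    def set_timer(self, deadline):
        # What the ported kernel does instead: ask the VMM via a hypercall.
        return self.vmm.hypercall(self.name, "set_timer", deadline)
```

In this model `GuestOs.set_timer_directly` raises `PrivilegeFault`, while `GuestOs.set_timer` succeeds because the privileged step is carried out by the VMM after validation.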

Xen – Implementation details

• Memory Management virtualization
  • the most critical aspect, requiring heavy intervention on guest OSs
  • much more difficult due to the x86 architecture
    • TLB faults are handled directly at the FW level
    • process relocation tables (page tables) must be available at the FW level

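[Figure: OS above FW; within the FW, the MMU sits between CPU and RAM and uses the per-process relocation tables (P0 RelocTable, P1 RelocTable); a “?” marks the open question]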

• the guest OS continues to manage its own relocation tables
  • relocation tables need to be verified by Xen at creation time
  • they remain read-only for the OS
• Xen resides at the topmost entries
  • which are reserved and not used by the OS
  • to avoid TLB flushes on hypercalls

Xen – Implementation details

• Process creation
  • the guest OS requests a new relocation table from Xen
  • relocation tables are augmented to include the pages mapped by Xen
  • Xen registers the new relocation table and acquires exclusive write access
  • all updates from the OS will cause page faults, so that Xen can verify the update request

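[Figure: OS, VMM and FW (CPU, MMU, RAM, with P0 RelocTable and P1 RelocTable); a write by the OS to a relocation table raises a protection fault, and the VMM performs protection verification]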
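The verification step above can be sketched as a toy Python model. It is a simplification under stated assumptions, not Xen's actual data structures: `ToyXen`, `register_table`, `update_entry` and the frame-ownership map are hypothetical names, and the protection fault is modelled as an explicit call into the VMM.

```python
class UpdateRejected(Exception):
    """Raised when a relocation-table update fails verification."""


class ToyXen:
    def __init__(self, frame_owner):
        # physical frame number -> name of the owning domain (or "xen")
        self.frame_owner = dict(frame_owner)
        # (domain, table id) -> {virtual page: physical frame}
        self.tables = {}

    def _verify(self, domain, frame):
        # A domain may only map frames that it owns.
        if self.frame_owner.get(frame) != domain:
            raise UpdateRejected(f"{domain} does not own frame {frame}")

    def register_table(self, domain, table_id, entries):
        # Creation time: every entry is verified, then the table is
        # kept by Xen, which holds exclusive write access to it.
        for frame in entries.values():
            self._verify(domain, frame)
        self.tables[(domain, table_id)] = dict(entries)

    def update_entry(self, domain, table_id, vpage, frame):
        # Stands in for the fault taken when the guest writes to its
        # read-only table: Xen verifies, then applies the update itself.
        self._verify(domain, frame)
        self.tables[(domain, table_id)][vpage] = frame
```

A guest that tries to map a frame owned by another domain gets `UpdateRejected` and the registered table is left unchanged.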

Xen – Implementation details

• CPU virtualization
  • change in the protection ring of the guest OS (0→1)
  • privileged instructions in the kernel
    • replace direct privileged operations within the OS kernel by hypercalls to Xen
  • scheduling of virtual machines must be efficient
    • many applications depend on timing (TCP/IP rtt, real-time services, …)
• Borrowed Virtual Time scheduling algorithm
  • addresses the low-latency constraint
  • grants efficient dispatch even for real-time contexts
  • notion of virtual time
  • possibility to borrow virtual time and get dispatch preference
  • general-purpose algorithm, not specialized for complex real-time paradigms
  • useful to address the problem of virtualization overheads in scheduling
• The notion of system time
  • guest OSs are provided with real time and virtual time
  • timers are dispatched to guest OSs by means of events
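The Borrowed Virtual Time idea listed above can be sketched in a few lines of Python. This is a simplified illustration, not the scheduler shipped with Xen: it omits the context-switch allowance and the limits on warping described by Duda and Cheriton, and the names `Domain`, `pick_next` and `run` are hypothetical. Each domain accumulates virtual time in inverse proportion to its weight; a latency-sensitive domain may "borrow" by subtracting a warp value, which gives it dispatch preference.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    name: str
    weight: float               # larger weight = larger CPU share
    warp: float = 0.0           # virtual time the domain may borrow
    warp_enabled: bool = False
    avt: float = 0.0            # actual virtual time consumed so far

    @property
    def evt(self):
        # Effective virtual time: actual time minus the borrowed amount.
        return self.avt - (self.warp if self.warp_enabled else 0.0)


def pick_next(domains):
    # Dispatch the domain with the smallest effective virtual time.
    return min(domains, key=lambda d: d.evt)


def run(domains, slices, quantum=1.0):
    """Run `slices` quanta and return the names in dispatch order."""
    trace = []
    for _ in range(slices):
        d = pick_next(domains)
        d.avt += quantum / d.weight   # heavier domains age more slowly
        trace.append(d.name)
    return trace
```

With two domains of weight 2 and 1, six quanta are split 4 to 2; enabling warp on a domain makes `pick_next` prefer it even when its actual virtual time is not the smallest.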

Xen – Implementation details

• I/O device virtualization
  • Xen addresses two critical issues
    • define a simplified interface for access to I/O
    • isolate drivers within their own virtual machine
• A simplified interface for I/O
  • not a novel issue
  • but past attempts have always ended in disaster
    • interface unioning instead of top-down simplification
    • a political matter more than a technical one
    • device firmware continuously evolving over time
    • flexibility, extensibility
  • Xen proposes one approach, validating it through experiments
• all data transfers are passed through and verified by Xen
  • a potential performance issue
  • but…

Xen – Implementation details

• Implementing zero-copy message passing
  • for high-performance data exchange among layers
  • shared, circular communication channels
  • out-of-band data buffers
  • shared memory pages among guest OS and Xen
  • pinning of physical memory pages for DMA
  • ownership exchange upon data receipt (network, disk)

[Figure: a shared, circular communication channel between OS and VMM carries descriptors of buffers; the buffers are pinned memory pages within the guest OS; the VMM performs protection verification; DMA from the FW bypasses the VMM; page ownership exchange]
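A minimal sketch of the channel described above, assuming a single producer and a single consumer. It is an illustration only: `DescriptorRing`, the `(page_id, offset, length)` descriptor layout and the `pages` dictionary are hypothetical, and Xen's real rings carry separate request and response indices. The point shown is that the ring carries small descriptors while the data stays in out-of-band pages and is never copied.

```python
class DescriptorRing:
    """Fixed-size circular channel carrying buffer descriptors, not data."""

    def __init__(self, size):
        self.slots = [None] * size
        self.prod = 0   # advanced only by the producer
        self.cons = 0   # advanced only by the consumer

    def put(self, descriptor):
        if self.prod - self.cons == len(self.slots):
            return False                      # ring is full
        self.slots[self.prod % len(self.slots)] = descriptor
        self.prod += 1
        return True

    def get(self):
        if self.cons == self.prod:
            return None                       # ring is empty
        descriptor = self.slots[self.cons % len(self.slots)]
        self.cons += 1
        return descriptor


# Out-of-band data buffers: stand-ins for pinned pages of the guest OS.
pages = {7: bytearray(b"hello, xen")}


def read_buffer(descriptor):
    """Return a zero-copy view of the bytes named by a descriptor."""
    page_id, offset, length = descriptor
    return memoryview(pages[page_id])[offset:offset + length]
```

Because `read_buffer` returns a `memoryview`, the consumer sees the producer's page directly; a later write to the page is visible through the same view.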

Xen – Implementation details

• Exceptions and Interrupts virtualization
  • a matter of translation
  • page faults are trickier: the faulting virtual address is held in the privileged register CR2, accessible only at ring 0

[Figure: OS, VMM and FW (CPU, MMU, RAM, with P0 RelocTable and P1 RelocTable); an entry with bitp = 0 raises a page fault; register CR2 carries the faulting address; the VMM reads CR2, saves it to a known location, then jumps to the handler code within the guest OS]

• the handler table is registered within the FW
  • the table refers to Xen code
• the privileged Xen code reads the content of the CR2 register and copies it to a known location within the guest OS
  • (the one responsible for the faulting virtual address space)
• finally, the Xen code simply jumps to the guest OS handler
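The forwarding steps above can be sketched as a toy Python model. All names here (`ToyCpu`, `GuestKernel`, `FaultForwardingXen`, `fault_addr_slot`) are hypothetical and the model ignores stack frames and segment state; it only shows the order of operations: Xen reads CR2 at ring 0, saves it at a location the guest knows, then transfers control to the guest handler.

```python
class ToyCpu:
    def __init__(self):
        self.cr2 = 0   # privileged register holding the faulting address


class GuestKernel:
    """Guest OS running at ring 1: it cannot read CR2 itself."""

    def __init__(self):
        self.fault_addr_slot = None   # known location filled in by Xen
        self.handled = []             # addresses seen by the guest handler

    def page_fault_handler(self):
        # The address is read from memory, never from CR2.
        self.handled.append(self.fault_addr_slot)


class FaultForwardingXen:
    """Owner of the handler table registered within the FW."""

    def __init__(self, cpu):
        self.cpu = cpu
        self.current_guest = None     # guest owning the faulting address space

    def on_page_fault(self):
        guest = self.current_guest
        guest.fault_addr_slot = self.cpu.cr2   # read CR2, save to known location
        guest.page_fault_handler()             # jump to the guest OS handler
```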

Xen – Implementation details

• Exceptions and Interrupts virtualization
  • the two kinds of exceptions most frequently raised
    • page faults
    • kernel traps (software exceptions)
• A performance risk
  • the first kind necessarily requires Xen mediation
  • for the second kind, mediation can possibly be skipped
    • directly register the guest OS exception handler table
    • subject to security validation by Xen, at start time
    • only for those entries
  • tables are swapped every time a virtual machine is scheduled
• Interrupts are more critical
  • data transfer through shared channels
    • either directly in pinned pages within the guest OS
    • or through ownership exchange
  • no de-multiplexing in FW
  • validation by Xen
  • a matter of translation
    • from FW interrupts to Xen HP events dispatched to the guest OS

[Figure: the OS issues a hypercall to Xen; the VMM performs validation (no privileged op claimed by code) and modification of the handler for page-fault; then registration within FW]
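The translation from FW interrupts to events can be sketched as follows. This is a toy model under stated assumptions: `EventDispatcher`, `GuestDomain`, the port numbers and the masking flag are hypothetical names, loosely modelled on the idea of a per-domain set of pending events, and do not reproduce Xen's real event-channel interface.

```python
class GuestDomain:
    def __init__(self, name):
        self.name = name
        self.pending = set()    # event ports raised but not yet handled
        self.masked = False     # guest can defer event delivery
        self.delivered = []     # ports handled so far, in order

    def drain(self):
        # Guest-side callback: handle every pending event.
        for port in sorted(self.pending):
            self.delivered.append(port)
        self.pending.clear()

    def unmask(self):
        self.masked = False
        self.drain()


class EventDispatcher:
    """Toy VMM side: translates FW interrupts into guest events."""

    def __init__(self):
        self.routes = {}        # irq number -> (domain, event port)

    def bind(self, irq, domain, port):
        self.routes[irq] = (domain, port)

    def interrupt(self, irq):
        # De-multiplexing happens here, in the VMM, not in the FW.
        domain, port = self.routes[irq]
        domain.pending.add(port)
        if not domain.masked:
            domain.drain()
```

While a domain has events masked, interrupts only mark ports as pending; they are delivered together once the domain unmasks.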

Xen – Administration tools

• The Xen layer just performs control and protection
• Policies are left to the layers above
  • exactly as in traditional OS design
  • separation of mechanisms and policies
    • crucial mechanisms in the µ/Exo kernel
    • policies in user-space processes/library functions
• Management and administration issues
  • same as in traditional OSs
  • memory sharing among VMs
    • through physical memory partitioning to enforce strong isolation
  • scheduling parameters
    • to control dispatching of VMs and weighted sharing of CPU time
  • creation of new VMs
  • virtual network interfaces
  • virtual block devices
• Several application-level tools to ease VMM management

Xen – The cost of porting guest OS

• The para-virtualized approach requires modifications
  • privileged operations replaced by hypercalls to Xen
  • device drivers and unified device interface
• The Linux case
  • somewhat modular structure of sources
    • the case of three-level relocation tables
  • a circumscribed intervention
  • Xenolinux
• The Windows XP case
  • a big mess
  • huge replication of code and functions
  • not yet completed
  • a really monolithic approach eventually hurts
• Numbers (lines of code modified or added)

What                        Linux    Windows XP
Architecture-independent       78          1299
Device drivers               1554         -----
Other (MM…)                  1363          3321
Portion of the whole        1.36%         -----

Xen – Performance evaluation

• We have to evaluate
  • Xen vs. other virtualization solutions
    • VMware and similar (but benchmarking restrictions…)
      • we can only say that “Xen significantly outperforms VMware”
    • User-Mode Linux
  • guest OS over Xen vs. native OS
  • multiplexing of VMs within Xen
  • performance isolation
    • with synthetic antisocial processes running alongside web servers

Migration technology: how Virtualization can help

• Typical problem from data centers/cluster administrators
  • not an HP solution
  • just a server management issue
• What to migrate… a process or an OS?
  • an entire OS (a Virtual Machine within Xen) is easier
    • extremely simple OS↔VMM interface
    • less prone to residual dependencies
      • a critical issue if migrating for maintenance
    • preserves OS-related abstractions
      • network connections
      • open files
    • no need to care about application-dependent approaches
• Three phases
  • pushing
    • copy address space pages to the target machine
    • the source entity is still computing
  • stop, final copying, restart
    • suspend the source entity
    • just re-transfer dirty pages
  • pulling
    • use the page fault handler to obtain missing pages
    • network connection to the source on demand
    • residual dependencies
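The pushing and stop-and-copy phases listed above can be sketched as a small simulation. It is an illustration under stated assumptions, not the migration code used with Xen: `migrate` and its arguments are hypothetical, memory is a dictionary from page number to content, and the writes that the still-running source performs during each push round are supplied up front. The pulling phase is left out of the sketch.

```python
def migrate(source, writes_per_round):
    """Simulate push rounds followed by a final stop-and-copy.

    source: dict mapping page number -> content (mutated by the writes).
    writes_per_round: list of dicts, the pages the source dirties while
        each push round is in progress.
    Returns (target memory, number of pages sent while suspended).
    """
    target = {}
    to_send = set(source)              # first round: every page
    for writes in writes_per_round:
        # Pushing: copy pages while the source entity is still computing.
        for page in to_send:
            target[page] = source[page]
        # Pages written during this round are dirty and must be re-sent.
        source.update(writes)
        to_send = set(writes)
    # Stop, final copying, restart: the source is suspended, so the
    # remaining dirty pages can be transferred consistently.
    stop_copy_pages = len(to_send)
    for page in to_send:
        target[page] = source[page]
    return target, stop_copy_pages
```

The fewer pages remain dirty after the last push round, the shorter the suspension; the returned page count is a rough stand-in for downtime.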

Migration technology: how Virtualization can help

• Implementation over Xen
  • a distributed file system
  • minimize downtime of the system
    • migrate while still computing: push+stop phases
    • no pulling phase: residual dependencies cannot be allowed, because of management activities on the source
  • the concept of Writeable Working Set (WWS)
  • two different solutions
    • at Xen level, managed migration
    • at OS level, self migration
  • interesting performance
    • 0.2 sec of downtime to migrate the SPECWeb benchmark

[Figure: source host (VMM with copy daemon, OSs with their Applications) connected through the network to the target host (VMM with stub, OSs with their Applications)]

Migration technology: how Virtualization can help

• Writeable Working Set (WWS)
  • extension based on the traditional Working Set in OSs
  • a (possibly large) set of pages will seldom or never be modified any more
  • useful to estimate the downtime of the VM
    • pages in the WWS will contribute to the overhead of the stop-copy phase
• Statistics about dirtying speed during each transfer phase
  • only transfer those dirty pages that were not dirty at the previous round
  • take care of the usual (small) amount of pages that will always be dirtied
    • Stack
• Incremental network bandwidth utilization
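A rough sketch of how the WWS relates to downtime, under simplifying assumptions: the stop-copy phase is taken to cost only the time needed to send the WWS pages at a fixed bandwidth, and a page dirtied in two consecutive rounds is treated as a probable WWS member. The function names and the example figures (4096-byte pages, 2000 pages, 100 MB/s) are hypothetical and not taken from the slides.

```python
def split_dirty_pages(dirty_now, dirty_previous):
    """Split this round's dirty pages following the heuristic above.

    Pages not dirty at the previous round are sent now; pages dirty in
    both rounds are probably in the WWS and are deferred.
    """
    send_now = dirty_now - dirty_previous
    probable_wws = dirty_now & dirty_previous
    return send_now, probable_wws


def stop_copy_downtime(wws_pages, page_size_bytes, bandwidth_bytes_per_s):
    """Estimate the stop-copy time as WWS size divided by bandwidth."""
    return wws_pages * page_size_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    send, wws = split_dirty_pages({1, 2, 3}, {2, 3, 4})
    print("send now:", sorted(send), "probable WWS:", sorted(wws))
    print("estimated downtime (s):", stop_copy_downtime(2000, 4096, 100_000_000))
```

The estimate ignores the time to suspend and resume the VM and any protocol overhead, so it is a lower bound on the real downtime.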

References

• T.E. Anderson. The case for Application-specific Operating Systems. In Third Workshop on Workstation Operating Systems, pages 92-94, 1992
• Dawson R. Engler, M. Frans Kaashoek, James O’Toole Jr. Exokernel: An Operating System Architecture for Application-Level Resource Management. In Proceedings of the 15th ACM Symposium on Operating Systems Principles, pages 251-266, 1995
• P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, A. Warfield. Xen and the Art of Virtualization. In Proceedings of the ACM Symposium on Operating Systems Principles, 2003
• Keir Fraser, Steven Hand, Rolf Neugebauer, Ian Pratt, Andrew Warfield, Mark Williamson. Safe Hardware Access with the Xen Virtual Machine Monitor. In Proceedings of the 1st Workshop on Operating System and Architectural Support for On-Demand IT Infrastructure, 2004
• K. J. Duda, D.R. Cheriton. Borrowed Virtual Time (BVT) Scheduling: Supporting latency-sensitive Threads in a general-purpose Scheduler. In Proceedings of the 17th ACM SIGOPS Symposium on Operating System Principles, pages 261-276, 1999
• T. Abels, P. Dhawan, B. Chandrasekaran. An Overview of Xen Virtualization
• C. Clark, K. Fraser, S. Hand, J. Gorm Hansen, E. Jul, C. Limpach, I. Pratt, A. Warfield. Live Migration of Virtual Machines. In Proceedings of the ACM/USENIX Symposium on Networked Systems Design and Implementation, 2005