Post on 20-Jul-2020
Advanced Computing and Information Systems laboratory
Experimental Study of Virtual Machine Migration in Support of Reservation of
Cluster Resources
Ming Zhao, Renato J. Figueiredo
Advanced Computing and Information Systems (ACIS)Electrical and Computer Engineering
University of Florida{ming, renato}@acis.ufl.edu
2Ming Zhao, VTDC’07
Motivating Application
Dynamic Data-driven Brain-machine InterfaceResource demanding parallel applicationTime-critical during online experiments
Must finish each closed loop within 100msNo stringent time requirement for offline study
• • • BMIModel
K
In VivoData Acquisition
BMIModel
1
Parallel Computing on Virtualized Resources
Dynamic Data-driven Brain-machine Interfaces(http://www.acis.ufl.edu/dddasbmi)
Online Experimenting Offline Study
Closed-LoopControl
DataStorage
BMIModel
2
Braininstitute
ACIS
3Ming Zhao, VTDC’07
Motivating Application
Motivation for VM-based resource reservationOnline experiments
Dedicate cluster resources for VMs running BMI modelsWithout reservation, deadline is often violated
Offline studyShare cluster resources with VMs running other workloads
Cluster reservationMigrating exiting VMs to other resources
100ms
Data pollingData
transfer
• • • BMIModel
K
In VivoData Acquisition
BMIModel
1
Parallel Computing on Virtualized Resources
Dynamic Data-driven Brain-machine Interfaces(http://www.acis.ufl.edu/dddasbmi)
Online Experimenting Offline Study
Closed-LoopControl
DataStorage
BMIModel
2
Computing
Robotcontrol
Latencies
4Ming Zhao, VTDC’07
Overview
Goal:Efficient VM-based cluster resource reservation
Challenge:Vacate resources in time (to meet schedule) but not too early (to fully utilize resources)
Solution:Build a migration overhead model through experimental analysis
5Ming Zhao, VTDC’07
Outline
Architecture
Experimental analysisMethodologyMigrating a single VMMigrating a sequence of VMsMigrating VMs in parallel
Summary
6Ming Zhao, VTDC’07
VM Scheduler
P1
Virtual ResourceManager
JobManager
1) Resource request:Need 1GHz CPU, 1GB RAM at 10am
2) Admission control: OK
4) VM instantiationVM1
VM2VM3
VM-based Resource ReservationVirtual resource manager
Global management of virtualized resourcesPresents an abstracted interface for resource requestsHides the complexity of using virtualized resourcesCoordinates with VM schedulers to reserve, allocate resources with VMs
Virtual machine schedulerPer-host management of VMsPresents a unified interface for managing VMsHides the complexity of using different VM software
6) Resource handlers:Machine IP: 10.5.156.30
3) Reservation:Create a VM with ... by 10am
7Ming Zhao, VTDC’07
VM Migration based Cluster Vacating
VM1 VM2 VM3
VM4 VM5
VM6 VM7
VM8 VM9
VM
VM
P1
P2
P3
VM
(I) P1 is shared by VMs running
various workloads
VM1 VM2 VM3
VM VM4 VM5
VM VM6 VM7
VM8 VM9VM
(II) P1 is reserved by migrating the existing
VMs to P2 and P3
P1
P2
P3
VM
VM1 VM2 VM3
VM VM4 VM5
VM VM6 VM7
VM8 VM9
VM
VM
VM
VM
(III) P1 is dedicated to the VMs created for a parallel application
P1
P2
P3
Example
8Ming Zhao, VTDC’07
Experimental SetupPhysical cluster
Each node has four CPUs (2.33GHz), 4GB memoryGigabit Ethernet
Virtual machinesVMware Server basedVM disk
Shared read-only image on NFS (.vmdk)Independent changes (redo logs) on local disks (.REDO)
VM memoryMapped to file (.vmem)Stored on local disks
VM1 VM2
vmxREDOvmem
vmxREDOvmem
Local
vmdk
NFS
Node 1
Local
Node 2
9Ming Zhao, VTDC’07
Experimental SetupMigration strategy
SuspendSynchronizes memory state to file
CopyUses FTP to transfer memory state file, disk redo filesMaximum throughput is about 100MB/s
ResumeRestores memory state from file in foreground
Not based on VMware solutions (VMotion)
VM1 VM2
vmxREDOvmem
vmxREDOvmem
Local
vmdk
NFS
Node 1
Local
Node 2
10Ming Zhao, VTDC’07
Migration Overhead of a Single VMDifferent memory sizes
Suspend and resume phases are fastCopy time grows as memory size increases
Different methods to store the migrated statesRAMFS removes the impact of disk I/Os to the migration process
Using disks on the destination host to store migrated VM state files
Using RAMFS on the destination host to store migrated VM state files
0
2
4
6
8
10
12
14
16
18
suspend copy resume
tim
e (s)
128MB
256MB
384MB
512MB
640MB
768MB
896MB
1024MB
0
2
4
6
8
10
12
14
16
18
suspend copy resume
tim
e (s)
128MB
256MB
384MB
512MB
640MB
768MB
896MB
1024MB
11Ming Zhao, VTDC’07
Modeling Copy PhaseUsing regression methods
Polynomial for using disk, linear for using RAMFS
RAMFS setup is used for all the following experiments
Using regression to model the copy phase
Using disks
y = 1E-05x2
+ 0.0043x + 1.2704
R2
= 0.9967
Using RAMFS
y = 0.0089x + 0.7502
R2
= 1
0
2
4
6
8
10
12
14
16
18
0 200 400 600 800 1000 1200
memory size (MB)
tim
e (s)
12Ming Zhao, VTDC’07
Modeling Suspend and Resume Phases
ResumeTypically short: memory state is buffered after the copy phase
SuspendTypically short: memory state is frequently synchronized to file
0
2
4
6
8
10
12
14
16
18
20
22
suspend resume
tim
e (s)
baseline
256MB
512MB
768MB
1024MB
13Ming Zhao, VTDC’07
Modeling the Suspend and Resume Phases
Running memory-intensive workload in the migrated VMA “rogue” program that continuously modifies memorySuspend time increases as more synchronization is needed
0
2
4
6
8
10
12
14
16
18
20
22
suspend resume
tim
e (s)
baseline
256MB
512MB
768MB
1024MB
14Ming Zhao, VTDC’07
Migrating a Sequence of VMs
Per-VM migration time is consistent regardless of the number of sequentially migrated VMs
Per-VM migration time when migrating a sequence of 256MB-RAM VMs
Per-VM migration time when migrating a sequence of 512MB-RAM VMs
0
1
2
3
4
5
6
7
8
9
10
suspend copy resume total
tim
e (s)
single VM
two VMs
four VMs
eight VMs
0
1
2
3
4
5
6
7
8
9
10
suspend copy resume total
tim
e (s)
single VM
two VMs
four VMs
15Ming Zhao, VTDC’07
Migrating VMs with CPU-intensive Workload
CPU-intensive workloadRuns iteratively, consumes 100% of CPU
Performance degradation = Impacted iteration time – Regular iteration time
Takes longer for applications in VMs to recover to full performance
Migrating four VMs in sequence; each has 512MB-RAM and runs a CPU-intensive benchmark
0
2
4
6
8
10
12
14
16
18
20
Time (s)
Ite
ra
tio
n T
ime
(s)
VM1
VM2
VM3
VM4
0
2
4
6
8
10
12
14
16
18
20
VM1 VM2 VM3 VM4
Mig
ra
tio
n d
ela
y (s)
Performance degradation VM migration time
16Ming Zhao, VTDC’07
Migrating VMs with Memory-intensive Workload
Memory-intensive workloadRuns iteratively, touches 100% of memory
Performance degradation = Time of 2 impacted iterations – Regular iteration time * 2
Migration is a more memory intensive process
Migrating four VMs in sequence; each has 512MB-RAM and runs a memory-intensive benchmark
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
30
32
VM1 VM2 VM3 VM4
Mig
ra
tio
n d
ela
y (s)
Performance degradation VM migration time
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
30
32
Time (s)
Ite
ra
tio
n tim
e (s)
VM1
VM2
VM3
VM4
c
17Ming Zhao, VTDC’07
Migrating VMs with Web WorkloadsApache Web server
Serves constant-rate requests from outside clients (100 request/s each)Takes longer for the throughputs to recover
Migrating four VMs in sequence; each has 512MB-RAM and runs a Web server
VM1
0
50
100
150
200
250
Reply
rate
VM2
0
50
100
150
200
250
Reply
rate
VM3
0
50
100
150
200
250
Reply
rate
VM4
0
50
100
150
200
250
Reply
rate
Aggregate
250
300
350
400
450
500
550
0 10 20 30 40 50 60 70
Time (s)
Reply
rate
18Ming Zhao, VTDC’07
0
10
20
30
40
50
60
1 2 4 8
Number of VMs
Mig
ra
tio
n tim
e (s)
256M-P
256M-S
0
10
20
30
40
50
60
1 2 4 8
Number of VMs
Mig
ra
tio
n tim
e (s)
256M-P
256M-S
512M-P
512M-S
Migrating VMs in ParallelParallel migration has shorter total migration time
Faster vacating of cluster resourcesSequential migration has shorter per-VM migration time
Less impact on the applications in the VMs
Total migration time
0
5
10
15
20
25
30
1 2 4 8
Number of VMs
Mig
ra
tio
n tim
e (s)
256M-P
256M-S
512M-P
512M-S
Per-VM migration time
19Ming Zhao, VTDC’07
Related WorkVM-based resource management
In-VIGOVirtual workspacesVirtuosoShirako
VM migration based resource reallocationVIOLINSandpiper
VM migration optimizationsVMware VMotionXen live migration
20Ming Zhao, VTDC’07
Summary
Conclusions:It is feasible to predict the migration time of VMs with different configurations and different numbersIt takes longer for a VM’s application to recover than the VM migration timeSequential migration has less impact on VMs’ applicationsParallel migration is faster for migrating multiple VMs
Future work:Generalizes and validates the model across different physical platforms, with different VM technologiesConsiders modeling of live migrations
21Ming Zhao, VTDC’07
Acknowledgement
DDDASBMI project teamhttp://www.acis.ufl.edu/dddasbmi
Sponsorship from NSF
Publication approval from VMware
Questions?ming@acis.ufl.eduhttp://www.acis.ufl.edu/~ming