Quantifying and Improving I/O Predictability in Virtualized Systems
Cheng Li, Inigo Goiri, Abhishek Bhattacharjee, Ricardo Bianchini, Thu D. Nguyen
2
Problem
• IaaS cloud providers (e.g., Amazon EC2)
  – Virtualization to consolidate virtual machines
• Performance may vary with consolidation
  – Interference and variable resource allocation
  – Inconsistent and unpredictable performance
[Figure: three physical machines hosting one, two, and three VMs, respectively]
3
Our solution
• Virtualized systems with predictable performance
  – Consolidation should not affect throughput, response time
• Predictability is different than isolation
  – Assignment of resources to VMs must be fixed at all times
• New class of predictable-performance service
• This paper: storage I/O predictability in VirtualFence
4
Why?
• Many users desire predictable performance
  – Streaming and gaming apps
  – Performance tuning, debugging, diagnosis
  – Proper app design (e.g., workflows, pipelines)
• Predictability benefits providers
  – Can charge for exactly the resources used
  – Direct relationship between resources and performance
• Predictability benefits users
  – Can implement apps that need predictability
  – Predictable cloud costs
5
Outline
• Motivation
• Quantifying unpredictability
• VirtualFence
• Evaluation
• Conclusions
6
How to measure unpredictability?
• Performance deviation
  – Relative change in performance
  – Stand-alone ($P_I$) vs. co-located ($P_D$)
  – Average throughput or average response time

$$\text{Deviation} = \frac{P_D - P_I}{P_I} \times 100\%$$
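The deviation metric can be sketched in a few lines of Python. This is only an illustration: the function name and the sample measurements are hypothetical, and the signed form (positive when the co-located value is larger) is an assumption about how the slide's formula is meant to be read.

```python
def deviation_percent(p_standalone: float, p_colocated: float) -> float:
    """Relative change of co-located performance vs. stand-alone, in percent.

    p_standalone corresponds to P_I and p_colocated to P_D; both must be
    the same kind of average (throughput or response time).
    """
    if p_standalone == 0:
        raise ValueError("stand-alone performance must be non-zero")
    return (p_colocated - p_standalone) / p_standalone * 100.0


# Hypothetical response times in milliseconds (not taken from the talk).
standalone_ms = 10.0
colocated_ms = 15.0
print(f"deviation: {deviation_percent(standalone_ms, colocated_ms):.1f}%")
```

With response time, a slowdown under co-location yields a positive value; with throughput, the same slowdown yields a negative one, so callers comparing the two may want to take the magnitude.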
7
Quantifying performance deviation
• Studied deviation across VMMs, storage devices, etc.
  – I/O performance deviation is endemic
• Some main sources of deviation:
  – Resource allocation policy (e.g., work-conserving)
  – Device-specific characteristics (e.g., SSD erasure)
• More findings in Rutgers DCS-TR-697
[Figure: response time deviation (y-axis, 0% to 500%) under Xen, WS, ESXi, and KVM; panels: "HDD, 4 VMs, Webserver" and "SSD, 4 VMs, Webserver"]
Deviation is high even for SSD-based storage
8
Outline
• Motivation
• Quantifying unpredictability
• VirtualFence
• Evaluation
• Conclusions
9
VirtualFence
• Predictable-performance storage system for Xen
[Figure: VirtualFence architecture; labels: VM, Virtual Device Driver, Dom0, Kernel, VirtualFence, Scheduler, VMM, SSD cache, Disk]
10
VirtualFence techniques
1. Non-work-conserving time partitioning
  – Each VM is assigned a fixed amount of I/O time
  – Avoids interleaving requests from multiple VMs
[Figure: time-slot timelines over T1 to T8. Stand-alone scenario: VM1 is served in T1 and T5; the other slots stay idle. Co-located scenario: the slots rotate VM1, VM2, VM3, VM4, VM1, VM2, VM3, VM4.]
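The time-partitioning idea can be illustrated with a small Python sketch. It is not the VirtualFence scheduler: the function name, the 10 ms slot length, and the round-robin slot table are assumptions chosen for illustration, and the sketch only computes when a request may start, not whether it finishes within the slot.

```python
from typing import Optional, Sequence


def next_dispatch_time(vm: str, arrival_ms: float, slot_len_ms: float,
                       schedule: Sequence[Optional[str]]) -> float:
    """Earliest time at which a request from `vm` may be dispatched.

    `schedule` is a repeating table of slot owners; None marks a slot that
    nobody owns. Non-work-conserving: a request never runs in a slot owned
    by another VM or in an unowned slot, even if the device is idle.
    """
    if vm not in schedule:
        raise ValueError(f"{vm} owns no slot in the schedule")
    current_slot = int(arrival_ms // slot_len_ms)
    for offset in range(len(schedule)):
        slot = current_slot + offset
        if schedule[slot % len(schedule)] == vm:
            # Inside our own slot we can start right away; otherwise we
            # wait for the start of our next slot.
            return max(arrival_ms, slot * slot_len_ms)
    raise AssertionError("unreachable: vm is in the schedule")


SLOT_MS = 10.0  # hypothetical slot length
standalone = ["VM1", None, None, None]
colocated = ["VM1", "VM2", "VM3", "VM4"]

# A VM1 request arriving at t = 12 ms waits for T5 (t = 40 ms) either way.
print(next_dispatch_time("VM1", 12.0, SLOT_MS, standalone))
print(next_dispatch_time("VM1", 12.0, SLOT_MS, colocated))
```

Because the idle slots are never given away, VM1 sees the same dispatch times whether or not other VMs are present, which is the property the stand-alone and co-located timelines depict.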
11
VirtualFence techniques
2. Small SSD cache in front of the HDD
  – Targets the HDD seek at the beginning of each time slot
3. Non-work-conserving space partitioning
  – Fixed-size SSD cache per VM
  – Guarantees each VM's cache space share
Users can purchase multiple time and space slots!
[Figure: SSD cache layout. Stand-alone: one partition for VM 1. Co-located: one partition each for VM 1, VM 2, VM 3, and VM 4.]
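Space partitioning can be sketched as a cache in which every VM owns a fixed number of blocks and never borrows from another partition. The class name, the block-granularity interface, and the LRU replacement policy are assumptions made for this illustration; the slides do not state which replacement policy VirtualFence uses.

```python
from collections import OrderedDict
from typing import Hashable, Iterable


class PartitionedCache:
    """Fixed-size, per-VM cache partitions (LRU within each partition)."""

    def __init__(self, blocks_per_vm: int, vms: Iterable[str]) -> None:
        if blocks_per_vm <= 0:
            raise ValueError("blocks_per_vm must be positive")
        self.blocks_per_vm = blocks_per_vm
        self.partitions = {vm: OrderedDict() for vm in vms}

    def access(self, vm: str, block: Hashable) -> bool:
        """Record an access; return True on a hit, False on a miss."""
        partition = self.partitions[vm]
        if block in partition:
            partition.move_to_end(block)   # mark as most recently used
            return True
        if len(partition) >= self.blocks_per_vm:
            partition.popitem(last=False)  # evict this VM's own LRU block
        partition[block] = True
        return False


cache = PartitionedCache(blocks_per_vm=2, vms=["VM1", "VM2"])
cache.access("VM1", "a")
cache.access("VM1", "b")
for block in range(100):        # VM2 streams through many blocks
    cache.access("VM2", block)
print(cache.access("VM1", "a"))  # VM1's blocks were not evicted by VM2
```

A VM's hit rate then depends only on its own access pattern and partition size, not on what its neighbors do, at the cost of leaving unused partitions idle.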
12
Outline
• Motivation
• Quantifying unpredictability
• VirtualFence
• Evaluation
• Conclusions
13
Experimental environment
• Aggressive consolidation (80% utilization)
• Filebench workloads
  – Webserver: read-only
  – Mailserver: mixed reads/sync writes
• Physical machine: 4-core Xeon, 1 SSD, 1 HDD (22 ms)
[Figure: VM1 (8%), VM2 (24%), VM3 (24%), VM4 (24%)]
14
VirtualFence evaluation
• VirtualFence benefits
• Contribution of each technique
• Impact of the workload
• Absolute performance and VirtualFence overheads
• Performance vs. deviation tradeoff
• VirtualFence combines all three techniques
• Approaches the deviation of SSD+TP at lower cost
15
VirtualFence results
[Figure: response time deviation (y-axis, 0% to 500%) for system configurations running mailserver; bar labels and values in the order they appear in the chart]
• HDD: 363%
• SSD: 439%
• HDD+TP: 29%
• SSD+TP: 10%
• Hybrid/Shared cache+TP: 288%
• VirtualFence: 15%
16
Impact of number of time slots
• More slots decrease deviation, degrade performance
• Ideal: fewest slots that allow enough consolidation
[Figure: response time deviation (0% to 30%) and response time (0 to 30 ms) versus number of slots (2, 3, 4) under webserver]
17
Impact of time slot length
• Longer slots decrease deviation, degrade performance
• Ideal: shortest slot that produces enough predictability
[Figure: response time deviation (0% to 60%) and response time (0 to 60 ms) versus slot length (10, 15, 20, 40 ms) under webserver]
18
Conclusions
• Consolidation leads to unpredictability
• VirtualFence
  – Software/hardware solution
  – Improves I/O predictability significantly
  – Provider selects best predictability vs. performance tradeoff
  – User rents as many slots as needed for good performance
Q&A