-
PARTIES: QOS-AWARE RESOURCE PARTITIONING FOR MULTIPLE INTERACTIVE SERVICES
Shuang Chen, Christina Delimitrou, José F. Martínez
Cornell University
-
COLOCATION OF APPLICATIONS

[Figure: latency-critical and best-effort applications colocated on a multicore server, sharing cores (P), per-core private caches, and the last-level cache]
Motivation • Characterization • PARTIES • Evaluation • Conclusions
-
PRIOR WORK
§ Interference during colocation
§ Scheduling [Nathuji’10, Mars’13, Delimitrou’14]
• Avoid co-scheduling of apps that may interfere
- May require offline knowledge
- Limits colocation options
§ Resource partitioning [Sanchez’11, Lo’15]
• Partition shared resources
- At most 1 LC app + multiple best-effort jobs
-
TRENDS IN DATACENTERS
§ Monoliths give way to microservices: more LC jobs, and all of them have QoS targets
§ Colocation shifts from 1 LC + many BE jobs to many LC + many BE jobs
-
MAIN CONTRIBUTIONS
§ Workload characterization
• The impact of resource sharing
• The effectiveness of resource isolation
• Relationship between different resources
§ PARTIES: first QoS-aware resource manager for colocation of many LC services
• Dynamic partitioning of 9 shared resources
• No a priori application knowledge
• 61% higher throughput under QoS constraints
• Adapts to varying load patterns
-
INTERACTIVE LC APPLICATIONS

Table 1: Latency-critical applications. Max load is the maximum RPS under the QoS target when the application runs alone.

|                     | Memcached       | Xapian     | NGINX      | Moses                 | MongoDB             | Sphinx             |
|---------------------|-----------------|------------|------------|-----------------------|---------------------|--------------------|
| Domain              | Key-value store | Web search | Web server | Real-time translation | Persistent database | Speech recognition |
| Target QoS          | 600us           | 5ms        | 10ms       | 15ms                  | 300ms               | 2.5s               |
| Max load            | 1,280,000       | 8,000      | 560,000    | 2,800                 | 240                 | 14                 |
| User / Sys / IO CPU%| 13 / 78 / 0     | 42 / 23 / 0| 20 / 50 / 0| 50 / 14 / 0           | 0.3 / 0.2 / 57      | 85 / 0.6 / 0       |
| LLC MPKI            | 0.55            | 0.03       | 0.06       | 10.48                 | 0.01                | 6.28               |
| Memory capacity     | 9.3 GB          | 0.02 GB    | 1.9 GB     | 2.5 GB                | 18 GB               | 1.4 GB             |
| Memory bandwidth    | 0.6 GB/s        | 0.01 GB/s  | 0.6 GB/s   | 26 GB/s               | 0.03 GB/s           | 3.1 GB/s           |
| Disk bandwidth      | 0 MB/s          | 0 MB/s     | 0 MB/s     | 0 MB/s                | 5 MB/s              | 0 MB/s             |
| Network bandwidth   | 3.0 Gbps        | 0.07 Gbps  | 6.2 Gbps   | 0.001 Gbps            | 0.01 Gbps           | 0.001 Gbps         |
-
INTERFERENCE STUDY

Table 2: Impact of resource interference. Each row corresponds to a type of resource. Values are the maximum percentage of max load for which the server can satisfy QoS while the LC application runs under interference on that resource. Smaller numbers (darker cells in the original heatmap) mean the application is more sensitive to that type of interference: 0% = extremely sensitive, 100% = not sensitive at all.

|                   | Memcached | Xapian | NGINX | Moses | MongoDB | Sphinx |
|-------------------|-----------|--------|-------|-------|---------|--------|
| Hyperthread       | 0%        | 0%     | 0%    | 0%    | 90%     | 0%     |
| CPU               | 10%       | 50%    | 60%   | 70%   | 100%    | 30%    |
| Power             | 60%       | 80%    | 90%   | 90%   | 100%    | 70%    |
| LLC capacity      | 90%       | 90%    | 70%   | 80%   | 100%    | 30%    |
| LLC bandwidth     | 0%        | 60%    | 70%   | 30%   | 90%     | 0%     |
| Memory bandwidth  | 0%        | 50%    | 40%   | 10%   | 70%     | 0%     |
| Memory capacity   | 0%        | 100%   | 0%    | 0%    | 20%     | 80%    |
| Disk bandwidth    | 100%      | 100%   | 100%  | 100%  | 10%     | 100%   |
| Network bandwidth | 20%       | 90%    | 10%   | 90%   | 90%     | 80%    |

• Applications are sensitive to resources they use heavily
• Applications with strict QoS targets are more sensitive
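The sweep behind Table 2 can be sketched as a simple load walk: for one application and one interference source, raise the offered load in steps of max load and record the last level at which the QoS target still holds. The function name `measure_tail_latency`, the 10% step, and the stop-on-first-violation shortcut (which assumes tail latency grows monotonically with load) are illustrative assumptions, not details from the talk.

```python
def max_load_under_interference(measure_tail_latency, qos, step=0.10):
    """Return the highest fraction of max load (0.0-1.0) meeting `qos`.

    `measure_tail_latency(load)` stands in for running the LC app at
    `load` * max_load alongside the interference microbenchmark and
    reporting its tail latency.
    """
    best = 0.0
    load = step
    while load <= 1.0 + 1e-9:
        if measure_tail_latency(load) <= qos:
            best = load          # QoS still satisfied at this load level
        else:
            break                # assume latency is monotonic in load
        load += step
    return round(best, 2)
```

Filling one cell of Table 2 then amounts to one call per (application, resource) pair with that resource's contention generator running.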
-
ISOLATION MECHANISMS

[Figure: server diagram with per-core private caches and the shared last-level cache]

• Core mapping (cgroup)
» Hyperthreads
» Core counts
• Memory capacity (cgroup)
• Disk bandwidth (cgroup)
• Core frequency (ACPI frequency driver)
» Power
• LLC capacity (Intel CAT)
» Cache capacity
» Cache bandwidth
» Memory bandwidth
• Network bandwidth (qdisc)
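The knobs above map onto standard Linux interfaces. A minimal sketch of what actuating them looks like, assuming cgroup v1 cpuset, resctrl for Intel CAT, the cpufreq sysfs files, and a tc HTB qdisc; the helper names, and the choice to return command strings rather than execute them, are mine:

```python
def cpuset_cmd(cgroup, cores):
    # cgroup cpuset: pin an app's container to an explicit core list
    core_list = ",".join(map(str, cores))
    return f"echo {core_list} > /sys/fs/cgroup/cpuset/{cgroup}/cpuset.cpus"

def cat_schemata(group, ways_mask):
    # Intel CAT via resctrl: contiguous bitmask of LLC ways for this group
    return f"echo 'L3:0={ways_mask:x}' > /sys/fs/resctrl/{group}/schemata"

def freq_cmd(core, khz):
    # per-core DVFS cap through the cpufreq driver
    return f"echo {khz} > /sys/devices/system/cpu/cpu{core}/cpufreq/scaling_max_freq"

def qdisc_cmd(iface, classid, rate_mbit):
    # tc HTB class: cap an app's egress network bandwidth
    return (f"tc class change dev {iface} parent 1: "
            f"classid 1:{classid} htb rate {rate_mbit}mbit")
```

Memory capacity and disk bandwidth limits follow the same pattern through the cgroup memory and blkio controllers.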
-
RESOURCE FUNGIBILITY
§ Resources are fungible
• More flexibility in resource allocation
• Simplifies the resource manager
[Figure: for Xapian, the set of (cache ways × cores) allocations that meet QoS, stand-alone vs. with memory interference; many different combinations of the two resources satisfy QoS, and the feasible set shifts when memory interference is added]
-
PARTIES: DESIGN PRINCIPLES
§ PARTIES• PARTitioning for multiple InteractivE Services
§ Design principles• LC apps are equally important• Allocation should be dynamic and fine-grained• No a priori application knowledge or offline profiling• Recover quickly from incorrect decisions• Migration is used as a last resort
-
PARTIES

• Latency monitor (client side and server side) feeds the main function, which polls latency every 100ms
• QoS violations? Upsize! Excess resources (latency slack above ~20%)? Downsize!
• 5 knobs organized into 2 wheels: a compute wheel (cores C, frequency F, cache $) and a storage wheel (memory M, disk D)
• Start from a random resource; follow the wheels to visit all resources
• Switch to the other wheel when adjusting the current knob shows no benefit (when upsizing, cross from compute to storage only if memory slack is also below 1 GB)
• Resources move between colocated applications (App1, App2, …) through an unallocated pool

[Figure: upsizing (UP) and downsizing (DOWN) state machines over the compute and storage wheels, with "No Benefit" transitions between knobs]
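The control loop above can be sketched as follows. The wheel contents, the 100ms poll, and the ~20% slack band come from the slide; the function names, the slack definition, and the exact per-step logic are illustrative assumptions:

```python
COMPUTE_WHEEL = ["cores", "frequency", "cache"]   # C, F, $
STORAGE_WHEEL = ["memory", "disk"]                # M, D

def next_knob(knob):
    """Follow the wheels: rotate within the current wheel so every
    resource is eventually visited."""
    wheel = COMPUTE_WHEEL if knob in COMPUTE_WHEEL else STORAGE_WHEEL
    return wheel[(wheel.index(knob) + 1) % len(wheel)]

def adjust(app, slack, knob):
    """One 100ms decision for `app`, where `slack` is the fractional
    latency headroom (target - measured) / target.

    Negative slack means a QoS violation -> upsize the current knob;
    slack above the ~20% band means excess resources -> downsize."""
    if slack < 0:
        return ("upsize", knob)
    if slack > 0.20:
        return ("downsize", knob)
    return ("hold", knob)
```

A real controller would additionally detect "no benefit" after an adjustment, cross between wheels as in the state machine, and return reclaimed resources to the unallocated pool.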
-
METHODOLOGY
§ Platform: Intel Xeon E5-2699 v4
• Single socket with 22 cores (8 IRQ cores)
§ Virtualization
• LXC 2.0.7
§ Load generators
• Open loop
• Request inter-arrival distribution: exponential
• Request popularity: Zipfian
§ Testing strategy
• Constant load: 30s warmup, 1-minute measurement (×5)
• Varying load: simulates diurnal load patterns
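The load-generator bullets above can be sketched as follows: open-loop send times with exponential inter-arrival and Zipfian key popularity. The key-space size, skew parameter, and function names are assumptions for illustration:

```python
import random

def zipf_weights(n_keys, s=0.99):
    """Unnormalized Zipfian popularity weights for keys 0..n_keys-1."""
    return [1.0 / (k ** s) for k in range(1, n_keys + 1)]

def generate_trace(rps, duration_s, n_keys, seed=42):
    """Return a list of (send_time, key) pairs.

    Open loop: send times are drawn up front from the arrival process
    and never depend on when responses come back, so the generator
    keeps offering load even when the server falls behind."""
    rng = random.Random(seed)
    weights = zipf_weights(n_keys)
    trace, t = [], 0.0
    while t < duration_s:
        t += rng.expovariate(rps)                      # exponential gaps
        key = rng.choices(range(n_keys), weights=weights)[0]
        trace.append((t, key))
    return trace
```

The open-loop property matters for tail-latency measurement: a closed-loop generator would throttle itself during server slowdowns and hide queueing delay.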
-
CONSTANT LOADS: MEMCACHED, XAPIAN & NGINX

[Figure: achievable max load of NGINX (%) as a function of the max loads of Xapian (%) and Memcached (%), compared across Unmanaged, Heracles, PARTIES, and Oracle]

Oracle:
- Offline profiling
- Always finds the global optimum

Heracles:
- No partitioning between BE jobs
- Suspends BE upon QoS violation
- No interaction between resources
-
MORE EVALUATION

Constant loads
§ All 2- and 3-app mixes under PARTIES
§ Comparison with Heracles for 2- to 6-app mixes

Diurnal load pattern
§ Colocation of Memcached, Xapian and Moses

PARTIES overhead
§ Convergence time for 2- to 6-app mixes
-
CONCLUSIONS
§ Need to manage multiple LC apps
§ Insights
• Resource partitioning
• Resource fungibility
§ PARTIES
• Partitions 9 shared resources
• No offline knowledge required
• 61% higher throughput under QoS targets
• Adapts to varying load patterns
-
PARTIES: QOS-AWARE RESOURCE PARTITIONING FOR MULTIPLE INTERACTIVE SERVICES
Shuang Chen, Christina Delimitrou, José F. Martínez
Cornell University
http://tiny.cc/parties