in n-Tier Applications through Fine-Grained Analysis
Transcript of in n-Tier Applications through Fine-Grained Analysis
Detecting Transient Bottlenecks
in n-Tier Applications through
Fine-Grained Analysis
Qingyang Wang, Yasuhiko Kanemasa, Jack Li,
Deepal Jayasinghe, Toshihiro Shimizu,
Masazumi Matsubara, Motoyuki Kawaba,
Calton Pu
Low response time is essential for Quality of Service (e.g., SLA for web-facing e-commerce applications). Amazon reports that every 100ms increase in the
page load decreases sales by 1%.
Achieving low response time at high resource utilization is challenging. Servers in typical data centers are only busy 18%
time on average, wasting power.
Dilemma between
Good Performance and High Utilization
[Dec 2010]
2
0
25
50
75
100
0 1 2 3
Time [s]
Transient bottlenecks (e.g., ten of milliseconds)
can cause wide-range response time variations
for web-facing n-tier applications. [ICAC’12]
An Important Factor: Transient Bottlenecks
Transient bottlenecks (e.g., < 100ms)
Challenging for typical monitoring tools with coarse granularity
Average: 75%
Reso
urc
e u
til.
[%]
3
Experimental Setup
RUBBoS benchmark: a bulletin board
system like Slashdot
24 web interactions
CPU intensive
Workload consists of emulated clients
Intel Xeon E5607
2 quad-core 2.26 GHz
16 GB memory
4
Motivational Example
WL 8,000 WL 8,000
Response time & throughput of a 3-minute benchmark
on the 4-tier application with increasing workloads.
Average response time is low at workload 8,000, how
about response time distribution?
200ms
5
Measured Long-Tail
Response Time Distribution
Response time distribution
at workload 8,000
The percentage of requests
over 2 seconds is more than 5%
# o
f co
mple
ted r
equest
s
Log scale
6
No Resources Are Saturated
CPU util. 79.2% CPU util. 78.1%
Workload is CPU intensive
Disk I/O utilization (<5%), network I/O utilization (<
20%), Memory usage (<40%);
CPU util. 34.6% CPU util. 26.7%
CJDBC
Tom
cat
CPU
util.
[%]
time [s]
MyS
QL C
PU
util.
[%]
time [s]
7
Sources: We find that other than bursty
workload, system environmental conditions:
Dynamic Voltage and Frequency Scaling (DVFS)
JVM garbage collection
Detection and Visualization: We develop a
fine-grained monitoring method based on
passive network tracing in network switches.
Negligible monitoring overhead for running
applications
Transient Bottlenecks:
Sources and Detection
8
Introduction & Motivation
Fine-grained load/throughput analysis method
1. Data collection via a passive network tracing tool
2. Calculation of active-load and throughput
3. Correlation analysis
Two Case Studies Dynamic Voltage and Frequency Scaling (DVFS)
JVM garbage collection
Conclusion & Future Works
Outline
9
Step1: Data Collection
through Passive Network Tracing
Collect interaction messages in the system using
SysViz to measure fine-grained active load and
throughput on each server.
Active load: The # of concurrent requests in a server
Throughput: The # of completed requests of a server
Web AP DB
Network switch
Web server AP server DB server
01:58:20.039 01:58:20.053
01:58:20.072
01:58:20.081
01:58:20.090
01:58:20.124
01:58:20.134
01:58:20.142
01:58:20.154
01:58:20.161
01:58:20.182
01:58:20.193
SysViz box
10
Step2: Fine-Grained Active Load
Calculation in a Server
SysViz
monitoring for
MySQL
50ms 50ms
time arrival
timestamp
departure
timestamp
11
Saturation
area
Active load in a server
Step3: Active-Load/Throughput
Correlation Analysis
Serv
er
thro
ugh
put
Saturation
point N*
TPmax
Non-
Saturation
area
12
0
10
20
30
40
50
60
0 0.5 1 1.5 2
Act
ive
-Lo
ad [
req
]
Time [s]
Fine-grained active-load in MySQL
0
2000
4000
6000
8000
0 0.5 1 1.5 2Thro
ugh
pu
t [r
eq
/s]
Time [s]
Fine-grained throughput in MySQL
Step3: Active-Load/Throughput Analysis
for MySQL at WL 12,000
Active-load &
throughput at
every 50ms
time window
13
0
10
20
30
40
50
60
0 0.5 1 1.5 2
Act
ive
-lo
ad [
req
]
Timeline [s]
Fine-grained active-load in MySQL
0
2000
4000
6000
8000
0 0.5 1 1.5 2
Thro
ugh
pu
t [r
eq
/s]
Timeline [s]
Fine-grained throughput in MySQL
MySQL Active Load
MyS
QL t
hro
ugh
put
Step3: Active-Load/Throughput Analysis
for MySQL at WL 12,000 (Cont.)
14
Three Typical Cases of Active-
Load/Throughput Analysis
Workload
4,000
Workload
16,000
Workload
12,000
Non-saturation
case
Partial
saturation case
Full saturation
case
MySQL active-load
MyS
QL t
hro
ugh
put
MyS
QL t
hro
ugh
put
MySQL active-load
MySQL active-load
MyS
QL t
hro
ugh
put
15
Transient Bottlenecks in the
Partial Saturation Case
3-D demo
Workload
12,000
MySQL active-load
M
ySQ
L t
hro
ugh
put
16
Introduction & Motivation
Fine-grained load/throughput analysis method
1. Data collection via a passive network tracing tool
2. Calculation of active-load and throughput
3. Correlation analysis
Two Case Studies Dynamic Voltage and Frequency Scaling (DVFS)
JVM garbage collection
Conclusion & Future Works
Outline
17
DVFS is designed to adjust CPU frequency
to meet instantaneous performance needs
while minimizing power consumption
We found that the anti-synchrony between
DVFS adjustment period and workload burst
cycles causes frequent transient bottlenecks.
Dell’s BIOS-level DVFS control
Transient Bottlenecks Caused by DVFS
P-state P0 P1 P4 P5 P8
CPU Frequency [MHz] 2261 2128 1729 1596 1197
18
DVFS Off case
MySQL Active load [#]
MyS
QL t
hro
ugh
put
[req/s
]
Transient Bottlenecks
of MySQL at Workload 8,000
DVFS On case
CPU is in low
frequency
CPU is in high
frequency
# concurrent requests in MySQL MySQL Active load [#]
MyS
QL t
hro
ugh
put
[req/s
]
19
Transient Bottlenecks
of MySQL at Workload 10,000
MyS
QL t
hro
ugh
put
[req/s
]
MySQL Active load [#]
DVFS On case
MyS
QL t
hro
ugh
put
[req/s
]
MySQL Active load [#]
DVFS Off case
Low frequency
Median frequency
High frequency High frequency
20
System Response Time with
DVFS On/Off at Workload 10,000
Time [s] Time [s] R
esp
on
se t
ime [
s]
R
esp
on
se t
ime [
s]
DVFS On case DVFS Off case
21
Introduction & Motivation
Fine-grained load/throughput analysis method
1. Data collection via a passive network tracing tool
2. Calculation of active-load and throughput
3. Correlation analysis
Two Case Studies Dynamic Voltage and Frequency Scaling (DVFS)
JVM garbage collection
Conclusion & Future Works
Outline
22
Transient bottlenecks can cause long-tail response time distributions of an n-tier application.
We developed a fine-grained active-load/throughput analysis method which can detect and visualize transient bottlenecks.
Ongoing work: more analysis of different types of workloads and more system factors that cause transient bottlenecks.
Conclusion & Future Work
23