Web Server Performance in a WAN Environment, Vincent W. Freeh, Computer Science, North Carolina State University
1
Web Server Performance in a WAN Environment
Vincent W. Freeh, Computer Science, North Carolina State University
Vsevolod V. Panteleenko, Computer Science & Engineering, University of Notre Dame
2
Large web site
- Complex design and interaction
- Multiple tiers: appliance web, app, & DB servers
Study performance of web server
- Cached pages
Most testing uses
- Simulated load
- LAN environment
Our evaluation adds
- Simulated WAN environment: small MTU, BW limits, latency
- Shows some optimizations are less valuable than LAN tests suggest
[Diagram: clients → appliance web servers → application servers → database servers]
3
Evaluating a web server
Three parts
- Measuring the server
- Loading the server
- Supporting the server
[Diagram: clients → (net) → appliance web servers → application & database servers (tiers 2 & 3); server load arrives from clients, server demand goes to tiers 2 & 3]
4
Two ways to load server
Synthetic load
- Controlled, reproducible, flexible
- Only as good as its assumptions and mechanisms
- Hard to replicate the real world
Real-world load
- Uncontrolled, not reproducible (though traces can be replayed)
- Accurate model of the system
- Hard to produce extreme or rare conditions
Discussion
- Need both
- Validate simulations with real-world tests
5
Loading the server
Our tests use synthetic load
- Three load-generating tools
Micro-benchmarking tool
- Requests a single object at a constant rate
- Tests delivery of static, cached documents
- Establishes a baseline
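A constant-rate, open-loop client like this can be sketched in a few lines (illustrative Python, not the tool used in the study; the URL and rate are placeholders):

```python
import time
import urllib.request

def schedule(rate_hz: float, duration_s: float) -> list[float]:
    """Evenly spaced request times over [0, duration_s)."""
    n = int(rate_hz * duration_s)
    return [i / rate_hz for i in range(n)]

def run_microbenchmark(url: str, rate_hz: float, duration_s: float) -> list[float]:
    """Request one object at a constant rate and record latencies.

    Open-loop pacing: send times are fixed in advance, so a slow reply
    does not reduce the offered load on the server.
    """
    latencies = []
    t0 = time.monotonic()
    for deadline in schedule(rate_hz, duration_s):
        delay = t0 + deadline - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        start = time.monotonic()
        with urllib.request.urlopen(url) as resp:
            resp.read()                      # fetch the single static object
        latencies.append(time.monotonic() - start)
    return latencies
```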
6
Modified SURGE
SURGE
- Scalable URL reference generator (Barford & Crovella, Boston University)
- Emulates statistical distributions of:
  - Object & request size
  - Object popularity
  - Embedded object references
  - Temporal locality
  - Use of idle periods
Modifications
- Converted from process-based to event-based to increase the number of clients
- Server-throttling problem eliminated
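The process-to-event conversion can be illustrated with coroutines (a sketch using modern asyncio, not SURGE's actual code): each emulated user becomes a coroutine, so thousands share one process and the generator no longer throttles itself on per-process overhead.

```python
import asyncio

async def emulated_user(user_id: int, think_time_s: float, n_requests: int) -> int:
    """One SURGE 'user equivalent' as a coroutine instead of a process.

    The HTTP request itself is stubbed out; the point is the concurrency
    structure: idle (think-time) periods yield to the event loop rather
    than blocking a whole process.
    """
    for _ in range(n_requests):
        # ...issue the next object request here...
        await asyncio.sleep(think_time_s)   # idle period between requests
    return user_id

async def generate_load(n_users: int) -> list[int]:
    # All emulated users run concurrently in a single process.
    return await asyncio.gather(*(emulated_user(u, 0.001, 3) for u in range(n_users)))
```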
7
Delays and limits
Emulate WAN parameters in a LAN
- Network delays
- Bandwidth limits
Modified kernel and protocol stack
- Separate delay queue per TCP connection
- Necessary for accurate emulation
- More accurate than Dummynet & NISTnet, which delay per interface
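The per-connection idea can be modeled in a few lines (a toy sketch with assumed units, not the authors' kernel code): each connection tracks when its own emulated link is next free, so one connection's backlog never delays another's packets.

```python
import heapq

class WanEmulator:
    """Toy model of per-connection delay queues.

    Each connection gets its own queue, unlike per-interface emulators
    such as Dummynet or NISTnet. Times are seconds, sizes are bytes.
    """

    def __init__(self, delay_s: float, bw_bps: float):
        self.delay_s = delay_s
        self.bw_Bps = bw_bps / 8.0    # bandwidth limit in bytes per second
        self.next_free = {}           # conn id -> time its emulated link is free
        self.queue = []               # min-heap of (release_time, conn, size)

    def enqueue(self, now: float, conn: int, size: int) -> float:
        """Queue a packet; return when it is released to the receiver."""
        # Serialize behind this connection's earlier packets only.
        start = max(now, self.next_free.get(conn, 0.0))
        done = start + size / self.bw_Bps          # serialization under BW cap
        self.next_free[conn] = done
        release = done + self.delay_s              # then the fixed delay
        heapq.heappush(self.queue, (release, conn, size))
        return release
```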
8
Measuring a web server
[Diagram: request/reply path through the server: HTTP layer (Apache, TUX), OS (TCP/IP, drivers), network]
9
Measuring a web server
[Diagram: request/reply path through HTTP, OS, and network layers]
Measure utilization using HW performance counters
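The study read the Pentium III's hardware counters from a modified kernel; on a current Linux system a comparable measurement can be approximated with perf (assumptions: the perf tool is installed and the server process is named apache2):

```shell
# Count cycles, instructions, and cache misses for a running web
# server over a 10-second window. A modern stand-in for the study's
# direct HW-counter reads, not the method used in the paper.
perf stat -e cycles,instructions,cache-misses -p "$(pidof -s apache2)" -- sleep 10
```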
10
Test environment
OS: Linux 2.4.8
Node (server & clients):
- Pentium III, 650 MHz
- 512 MB main memory
NIC: 3Com 3C590, 100 Mbps Ethernet, directly connected
Software:
- Client: micro-benchmarking, SURGE, delay/limits
- Server: Apache, TUX
Warmed client: no cache misses
[Diagram: clients and server directly connected via their NICs]
11
Cost breakdown – file size, Apache
- Majority of time spent in interrupt handling (receiving), even though most data is sent
Parameters: MTU = 536 bytes, Delay = 200 ms, BW = 56 Kbps, Data send rate = 3 MB/s
12
Cost breakdown - file size, TUX
- Twice the data send rate of Apache
- Essentially all cost in interrupts
Parameters: MTU = 536 bytes, Delay = 200 ms, BW = 56 Kbps, Data send rate = 6 MB/s
13
Apache versus TUX
                          Apache     TUX
Server send rate          3.0 MB/s   6.0 MB/s
Packets received / s      5,738      11,991
Packets sent / s          6,156      11,878
Interrupts / s            7,482      13,974
Concurrent connections    784        1,451
14
Cost breakdown vs. MTU
SURGE parameters: Size = 10 KB, Delay = 200 ms, BW = 56 Kbps, Data send rate = 6 MB/s
15
Effects of network delay
SURGE parameters: MTU = 536 bytes, Size = 10 KB, BW = 56 Kbps, Data send rate = 6 MB/s
16
Effects of bandwidth limits
SURGE parameters: MTU = 536 bytes, Size = 10 KB, Delay = 200 ms, Data send rate = 6 MB/s
20% decrease in overhead from 28 Kbps to unlimited bandwidth
17
Persistent connections
SURGE parameters: MTU = 536 bytes, Size = 10 KB, Delay = 200 ms, Data send rate = 6 MB/s
10% decrease going from 1 to 16 requests per connection
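The shape of that number follows from simple amortization of connection setup/teardown over the requests on a persistent connection. The unit costs below are illustrative, not measured values; a per-connection cost near 10% of the total reproduces a saving of roughly 10% at 16 requests per connection.

```python
def cost_per_request(per_request_cost: float,
                     per_connection_cost: float,
                     reqs_per_conn: int) -> float:
    """Average cost per request when the connection open/close cost is
    amortized over reqs_per_conn requests (HTTP/1.1 persistence)."""
    return per_request_cost + per_connection_cost / reqs_per_conn
```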
18
Copy and checksumming
[Bar chart: CPU utilization (0 to 0.6) for "zero copy & HW checksumming" vs. "copy & checksum", broken down into kernel mode, socket write, soft interrupt, and hard interrupt]
SURGE parameters: MTU = 536 bytes, Size = 10 KB, Delay = 200 ms, Data send rate = 6 MB/s
19
Re-assess value of some optimizations
Copy & checksum avoidance
- LAN: 25-111% (copy), or 21-33% (copy) & 10-15% (checksum)
- WAN: 10% combined
Select optimization
- LAN: 28%
- WAN: < 10%
Connection open/close avoidance (HTTP 1.1)
- LAN: "greatly", "significantly"
- WAN: < 10%
20
Conclusion
Most processing is in the protocol stack and drivers
- Small MTU size increases processing cost
Little effect from
- Network delay
- Bandwidth limitations
- Persistent connections
End-user request latency depends
- Primarily on connection bandwidth
- Secondarily on network delay
Future work
- Dynamic & uncached pages
- Add packet loss
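The latency ordering can be checked to first order with the WAN parameters used throughout the talk (10 KB object, 56 Kbps, 200 ms delay); the model below ignores slow-start and loss.

```python
def request_latency_s(size_bytes: int, bw_bps: float, rtt_s: float) -> float:
    """First-order request latency: one round trip plus serialization
    of the object at the bottleneck bandwidth."""
    return rtt_s + size_bytes * 8 / bw_bps

# For a 10 KB object at 56 Kbps with 200 ms delay, serialization
# (about 1.46 s) dwarfs the fixed delay, so bandwidth dominates.
```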
Work supported by IBM UPP & NSF CCR9876073
www.csc.ncsu.edu/faculty/freeh/
21
End
22
Persistent connections - packets/s
23
Number of Packets vs. MTU
24
Web (HTTP) servers
Apache
- Largest install base
- User space
- Process-based model
TUX
- Niche server
- Kernel space
- Event-based model
- Aggressive optimizations: copy/checksum avoidance, object & name caching
25
Measuring a web server
[Diagram: request/reply path through HTTP, OS, and network layers]
26
Interrupt coalescing
[Bar chart: CPU utilization (0 to 0.6) for Apache and TUX, normal vs. interrupt coalescing, broken down into user mode, kernel mode, socket write, soft interrupt, and hard interrupt]
Decreases interrupt scheduling overhead
- Interrupt at most once every 2 ms
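On modern Linux NICs, a comparable 2 ms coalescing interval can be requested with ethtool (driver support varies, and the study instead used a modified driver; eth0 is a placeholder interface name):

```shell
# Fire a receive interrupt at most every 2000 microseconds (2 ms)
# instead of once per packet.
ethtool -C eth0 rx-usecs 2000
```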