Web Servers: Implementation and Performance
Erich Nahum
IBM T.J. Watson Research Center
www.research.ibm.com/people/n/nahum
[email protected]
Contents and Timeline:
• Introduction to the Web (30 min):
  – HTTP, clients, servers, proxies, DNS, CDNs
• Outline of a Web Server Transaction (25 min):
  – Receiving a request, generating a response
• Web Server Architectural Models (20 min):
  – Processes, threads, events
• Web Server Workload Characteristics (30 min):
  – File sizes, document popularity, embedded objects
• Web Server Workload Generation (20 min):
  – Webstone, SpecWeb, TPC-W
Things Not Covered in Tutorial
• Client-side issues: HTML rendering, Javascript interpretation
• TCP issues: implementation, interaction with HTTP
• Proxies: some similarities, many differences
• Dynamic Content: CGI, PHP, EJB, ASP, etc.
• QoS for Web Servers
• SSL/TLS and HTTPS
• Content Distribution Networks (CDNs)
• Security and Denial of Service
Assumptions and Expectations
• Some familiarity with the WWW as a user (has anyone here not used a browser?)
• Some familiarity with networking concepts (e.g., unreliability, reordering, race conditions)
• Familiarity with systems programming (e.g., know what sockets, hashing, caching are)
• Examples will be based on C & Unix, taken from BSD, Linux, AIX, and real servers (sorry, Java and Windows fans)
Objectives and Takeaways
After this tutorial, hopefully we will all know:
• Basics of the Web, HTTP, clients, servers, DNS
• Basics of server implementation & performance
• Pros and cons of various server architectures
• Characteristics of web server workloads
• Difficulties in workload generation
• Design loop of implement, measure, debug, and fix

Many lessons should be applicable to any networked server, e.g., files, mail, news, DNS, LDAP, etc.
Acknowledgements
Many people contributed comments and suggestions to this tutorial, including:
Abhishek Chandra, Mark Crovella, Suresh Chari, Peter Druschel, Jim Kurose, Balachander Krishnamurthy, Vivek Pai, Jennifer Rexford, Anees Shaikh, and Srinivasan Seshan.
Errors are all mine, of course.
Chapter 1: Introduction to the World-Wide Web (WWW)
Introduction to the WWW
• HTTP: Hypertext Transfer Protocol
  – Communication protocol between clients and servers
  – Application-layer protocol for the WWW
• Client/Server model:
  – Client: browser that requests, receives, and displays objects
  – Server: receives requests and responds to them
  – Proxy: intermediary that aggregates requests and responses
• Protocol consists of various operations
  – Few for HTTP 1.0 (RFC 1945, 1996)
  – Many more in HTTP 1.1 (RFC 2616, 1999)
[Diagram: Client → Proxy → Server; HTTP requests flow from client through proxy to server, and HTTP responses flow back along the same path.]
How are Requests Generated?
• User clicks on something
• Uniform Resource Locator (URL):
  – http://www.nytimes.com
  – https://www.paymybills.com
  – ftp://ftp.kernel.org
  – news://news.deja.com
  – telnet://gaia.cs.umass.edu
  – mailto:[email protected]
• Different URL schemes map to different services
• Hostname is converted from a name to a 32-bit IP address (DNS resolve)
• Connection is established to the server

Most browser requests are HTTP requests.
How are DNS names resolved?
• Clients have a well-known IP address for a local DNS name server
• Clients ask the local name server for the IP address
• The local name server may not know it, however!
• Each name server has, in turn, a parent to ask (the "DNS hierarchy")
• The local name server's job is to iteratively query servers until the name is found, then return the IP address to the client
• Each name server can cache names, but:
  – Each name:IP mapping has a time-to-live field
  – After the time expires, the name server must discard the mapping
DNS in Action

[Diagram: myclient.watson.ibm.com asks its local name server, ns.watson.ibm.com, for www.ipam.ucla.edu. The local name server queries A.GTLD-SERVER.NET, which refers it to ns.ucla.edu (TTL = 1d). ns.ucla.edu answers with 12.100.104.5 (TTL = 10 min), which is returned to the client. The client then connects to 12.100.104.5, sends "GET /index.html", and receives "200 OK" with index.html.]
What Happens Then?

• Client downloads HTML document
  – Sometimes called "container page"
  – Typically in text format (ASCII)
  – Contains instructions for rendering (e.g., background color, frames)
  – Links to other pages
• Many have embedded objects:
  – Images: GIF, JPG (logos, banner ads)
  – Usually automatically retrieved, i.e., without user involvement
  – Can sometimes be controlled (e.g., browser options, junkbuster)

Sample HTML file:

<html>
<head>
<meta name="Author" content="Erich Nahum">
<title> Linux Web Server Performance </title>
</head>
<body text="#00000">
<img width=31 height=11 src="ibmlogo.gif">
<img src="images/new.gif">
<h1>Hi There!</h1>
Here's lots of cool linux stuff!
<a href="more.html">Click here</a> for more!
</body>
</html>
So What’s a Web Server Do?
• Respond to client requests, typically from a browser
  – Can be a proxy, which aggregates client requests (e.g., AOL)
  – Could be a search engine spider or custom client (e.g., Keynote)
• May have work to do on the client's behalf:
  – Is the client's cached copy still good?
  – Is the client authorized to get this document?
  – Is the client a proxy on someone else's behalf?
  – Run an arbitrary program (e.g., stock trade)
• Hundreds or thousands of simultaneous clients
• Hard to predict how many will show up on a given day
• Many requests are in progress concurrently
Server capacity planning is non-trivial.
What do HTTP Requests Look Like?
GET /images/penguin.gif HTTP/1.0
User-Agent: Mozilla/0.9.4 (Linux 2.2.19)
Host: www.kernel.org
Accept: text/html, image/gif, image/jpeg
Accept-Encoding: gzip
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8
Cookie: B=xh203jfsf; Y=3sdkfjej
<cr><lf>

• Messages are in ASCII (human-readable)
• Carriage-return and line-feed indicate end of headers
• Headers may communicate private information (browser, OS, cookie information, etc.)
What Kind of Requests are there?
Called Methods:
• GET: retrieve a file (95% of requests)
• HEAD: just get meta-data (e.g., mod time)
• POST: submitting a form to a server
• PUT: store enclosed document as URI
• DELETE: remove named resource
• LINK/UNLINK: in 1.0, gone in 1.1
• TRACE: HTTP "echo" for debugging (added in 1.1)
• CONNECT: used by proxies for tunneling (1.1)
• OPTIONS: request for server/proxy options (1.1)
What Do Responses Look Like?
HTTP/1.0 200 OK
Server: Tux 2.0
Content-Type: image/gif
Content-Length: 43
Last-Modified: Fri, 15 Apr 1994 02:36:21 GMT
Expires: Wed, 20 Feb 2002 18:54:46 GMT
Date: Mon, 12 Nov 2001 14:29:48 GMT
Cache-Control: no-cache
Pragma: no-cache
Connection: close
Set-Cookie: PA=wefj2we0-jfjf
<cr><lf>
<data follows…>

• Similar format to requests (i.e., ASCII)
What Responses are There?
• 1XX: Informational (def'd in 1.0, used in 1.1)
  100 Continue, 101 Switching Protocols
• 2XX: Success
  200 OK, 206 Partial Content
• 3XX: Redirection
  301 Moved Permanently, 304 Not Modified
• 4XX: Client error
  400 Bad Request, 403 Forbidden, 404 Not Found
• 5XX: Server error
  500 Internal Server Error, 503 Service Unavailable, 505 HTTP Version Not Supported
What are all these Headers?
Specify capabilities and properties:
• General: Connection, Date
• Request: Accept-Encoding, User-Agent
• Response: Location, Server type
• Entity: Content-Encoding, Last-Modified
• Hop-by-hop: Proxy-Authenticate, Transfer-Encoding

Server must pay attention to respond properly.
The Role of Proxies
[Diagram: clients connect to a local proxy, which connects across the Internet to servers.]

• Clients send requests to local proxy
• Proxy sends requests to remote servers
• Proxy can cache responses and return them
Why have a Proxy?
• For performance:
  – Many of the same web documents are requested by many different clients ("locality of reference")
  – A copy of the document can be cached for later requests (typical document hit rate: ~50%)
  – Since the proxy is closer to the client, response times are smaller than from the server
• For cost savings:
  – Organizations pay by ISP bandwidth used
  – Cached responses don't consume ISP bandwidth
• For security/policy:
  – Typically located in "demilitarized zone" (DMZ)
  – Easier to protect a single point rather than all clients
  – Can enforce corporate/government policies (e.g., porn)
Proxy Placement in the Web
[Diagram: clients connect through a hierarchy of proxies across the Internet; a "reverse" proxy sits in front of the servers.]

• Proxies can be placed at arbitrary points in the net:
  – Can be organized into hierarchies
  – Placed in front of a server: "reverse" proxy
  – Route requests to specific proxies: content distribution
Content Distribution Networks
[Diagram: origin servers push content across the Internet to proxies located near clients.]

• Push content out to proxies:
  – Route client requests to "closest" proxy
  – Reduce load on origin server
  – Reduce response time seen by client
Mechanisms for CDNs

• IP Anycast:
  – Route an IP packet to one of many IP addresses
  – Some research, but not deployed or supported by IPv4
• TCP Redirection:
  – Client TCP packets go to one machine, but responses come from a different one
  – Clunky; not clear it reduces load or response time
• HTTP Redirection:
  – When the client connects, use a 302 response (Moved Temporarily) to send the client to a proxy close to it
  – Server must be aware of the CDN network
• DNS Redirection:
  – When a client asks for the server's IP address, answer based on where the client is in the network
  – Used by most CDN providers (e.g., Akamai)
DNS-Based Request-Routing

[Diagram: a client asks its local name server for service.com; the local name server forwards "service.com?" to the request-routing DNS name server for www.service.com, which selects among CDN nodes cdn 1 through cdn 5 and answers "cdn 3"; the local name server returns "cdn 3" to the client.]
Summary: Introduction to WWW
• The major application on the Internet
  – Majority of traffic is HTTP (or HTTP-related)
  – Messages mostly in ASCII text (helps debugging!)
• Client/server model:
  – Clients make requests, servers respond to them
  – Proxies act as servers to clients, clients to servers
• Content may be spread across the network
  – Through either proxy caches or content distribution networks
  – DNS redirection is the common approach to CDNs
• Various HTTP headers and commands
  – Too many to go into detail here
  – We'll focus on common server ones
  – Many web books/tutorials exist (e.g., Krishnamurthy & Rexford 2001)
Chapter 2: Outline of a Typical Web Server Transaction
Outline of an HTTP Transaction
• In this section we go over the basics of servicing an HTTP GET request from user space
• For this example, we'll assume a single process running in user space, similar to Apache 1.3
• At each stage see what the costs/problems can be
• Also try to think of where costs can be optimized
• We’ll describe relevant socket operations as we go
initialize;
forever do {
    get request;
    process;
    send response;
    log request;
}

(a server in a nutshell)
Readying a Server
• First thing a server does is notify the OS it is interested in WWW server requests; these are typically on TCP port 80. Other services use different ports (e.g., SSL is on 443)
• Server allocates a socket and bind()s it to the address (port 80)
• Server calls listen() on the socket to indicate willingness to receive requests
• Calls accept() to wait for a request to come in (and blocks)
• When the accept() returns, we have a new socket which represents a new connection to a client
s = socket();              /* allocate listen socket */
bind(s, 80);               /* bind to TCP port 80 */
listen(s);                 /* indicate willingness to accept */
while (1) {
    newconn = accept(s);   /* accept new connection (blocks) */
    ...
}
Processing a Request
• getpeername() called to get the remote host's address
  – for logging purposes (optional, but done by most)
• gethostbyaddr() called to get the name of the other end
  – again for logging purposes
• gettimeofday() is called to get the time of the request
  – both for the Date header and for logging
• read() is called on the new socket to retrieve the request
• request is determined by parsing the data
  – "GET /images/jul4/flag.gif"

remoteIP = getpeername(newconn);
remoteHost = gethostbyaddr(remoteIP);
gettimeofday(&currentTime);
read(newconn, reqBuffer, sizeof(reqBuffer));
reqInfo = serverParse(reqBuffer);
Processing a Request (cont)
• stat() called to test the file path
  – to see if the file exists/is accessible
  – may not be there, may only be available to certain people
  – "/microsoft/top-secret/plans-for-world-domination.html"
• stat() also used for file meta-data
  – e.g., size of file, last modified time
  – "Have plans changed since last time I checked?"
• might have to stat() multiple files just to get to the end
  – e.g., 4 stats in the bill g example above
• assuming all is OK, open() called to open the file

fileName = parseOutFileName(requestBuffer);
fileAttr = stat(fileName);
serverCheckFileStuff(fileName, fileAttr);
open(fileName);
Responding to a Request
• read() called to read the file into user space
• write() is called to send HTTP headers on the socket (early servers called write() for each header!)
• write() is called to write the file on the socket
• close() is called to close the socket
• close() is called to close the open file descriptor
• write() is called on the log file

read(fileName, fileBuffer);
headerBuffer = serverFigureHeaders(fileName, reqInfo);
write(newSock, headerBuffer);
write(newSock, fileBuffer);
close(newSock);
close(fileName);
write(logFile, requestInfo);
Optimizing the Basic Structure
• As we will see, a great deal of locality exists in web requests and web traffic.
• Much of the work described above doesn't really need to be performed each time.
• Optimizations fall under 2 categories: caching and custom OS primitives.
Optimizations: Caching
Idea is to exploit locality in client requests. Many files are requested over and over (e.g., index.html).

• Why open and close files over and over again? Instead, cache open file FDs, manage them LRU.
• Why stat them again and again? Cache path name and access characteristics.
• Again, cache HTTP header info on a per-URL basis, rather than re-generating the info over and over.

fileDescriptor = lookInFDCache(fileName);
metaInfo = lookInMetaInfoCache(fileName);
headerBuffer = lookInHTTPHeaderCache(fileName);
Optimizations: Caching (cont)
• Instead of reading and writing the data, cache data, as well as meta-data, in user space
• Even better, mmap() the file so that two copies don't exist in both user and kernel space
• Since we see the same clients over and over, cache the reverse name lookups (or better yet, don't do resolves at all; log only IP addresses)

fileData = lookInFileDataCache(fileName);
fileData = lookInMMapCache(fileName);
remoteHostName = lookRemoteHostCache(remoteIP);
Optimizations: OS Primitives
• Rather than call accept(), getpeername() & read(), add a new primitive, acceptExtended(), which combines the three:

acceptExtended(listenSock, &newSock, readBuffer, &remoteInfo);

• Instead of calling gettimeofday(), use a memory-mapped counter that is cheap to access (a few instructions rather than a system call):

currentTime = *mappedTimePointer;

• Instead of calling write() many times, use writev():

buffer[0] = firstHTTPHeader;
buffer[1] = secondHTTPHeader;
buffer[2] = fileDataBuffer;
writev(newSock, buffer, 3);
OS Primitives (cont)

• Rather than calling read() & write(), or write() with an mmap()'ed file, use a new primitive called sendfile() (or transmitfile()). Bytes stay in the kernel.
• While we're at it, add a header option to sendfile() so that we don't have to call write() at all.
• Also add an option to close the connection so that we don't have to call close() explicitly.

httpInfo = cacheLookup(reqBuffer);
sendfile(newConn, httpInfo->headers,
         httpInfo->fileDescriptor, OPT_CLOSE_WHEN_DONE);

All this assumes proper OS support. Most have it these days.
An Accelerated Server Example
• acceptex() is called
  – gets new socket, request, remote host IP address
• string match in hash table is done to parse request
  – hash table entry contains relevant meta-data, including modification times, file descriptors, permissions, etc.
• sendfile() is called
  – pre-computed header, file descriptor, and close option
• log written back asynchronously (buffered write())

That's it!

acceptex(socket, newConn, reqBuffer, remoteHostInfo);
httpInfo = cacheLookup(reqBuffer);
sendfile(newConn, httpInfo->headers,
         httpInfo->fileDescriptor, OPT_CLOSE_WHEN_DONE);
write(logFile, requestInfo);
Complications
• Much of this assumes sharing is easy:
  – but this is dependent on the server architectural model
  – if multiple processes are being used, as in Apache, it is difficult to share data structures
• Take, for example, mmap():
  – mmap() maps a file into the address space of a process
  – a file mmap'ed in one address space can't be re-used for a request for the same file served by another process
  – Apache 1.3 does use mmap() instead of read()
  – in this case, mmap() eliminates one data copy versus a separate read() & write() combination, but the process will still need to open() and close() the file
Complications (cont)
• Similarly, meta-data info needs to be shared:
  – e.g., file size, access permissions, last modified time, etc.
• While locality is high, cache misses can and do happen sometimes:
  – if a previously unseen file is requested, the process can block waiting for disk
• OS can impose other restrictions:
  – e.g., limits on number of open file descriptors
  – e.g., sockets typically allow buffering about 64 KB of data. If a process tries to write() a 1 MB file, it will block until the other end receives the data
• Need to be able to cope with the misses without slowing down the hits
Summary: Outline of a Typical HTTP Transaction
• A server can perform many steps in the process of servicing a request
• Different actions depending on many factors:
  – e.g., 304 Not Modified if the client's cached copy is good
  – e.g., 404 Not Found, 401 Unauthorized
• Most requests are for a small subset of data:
  – we'll see more about this in the Workload section
  – we can leverage that fact for performance
• Architectural model affects possible optimizations
  – we'll go into this in more detail in the next section
Chapter 3: Server Architectural Models
Server Architectural Models
Several approaches to server structure:
• Process-based: Apache, NCSA
• Thread-based: JAWS, IIS
• Event-based: Flash, Zeus
• Kernel-based: Tux, AFPA, ExoKernel
We will describe the advantages and disadvantages of each.
Fundamental tradeoffs exist between performance, protection, sharing, robustness, extensibility, etc.
Process Model (ex: Apache)
• Process created to handle each new request:
  – Process can block on appropriate actions (e.g., socket read, file read, socket write)
  – Concurrency handled via multiple processes
• Quickly becomes unwieldy:
  – Process creation is expensive
  – Instead, a pre-forked pool is created
  – Upper limit on # of processes is enforced, first by the server, eventually by the operating system
  – Concurrency is limited by the upper bound
Process Model: Pros and Cons
• Advantages:
  – Most importantly, consistent with the programmer's way of thinking. Most programmers think in terms of a linear series of steps to accomplish a task.
  – Processes are protected from one another; can't nuke data in some other address space. Similarly, if one crashes, others are unaffected.
• Disadvantages:
  – Slow. Forking is expensive; allocating stack and VM data structures for each process adds up and puts pressure on the memory system.
  – Difficulty in sharing info across processes.
  – Have to use locking.
  – No control over scheduling decisions.
Thread Model (Ex: JAWS)
• Use threads instead of processes. Threads consume fewer resources than processes (e.g., stack, VM allocation).
• Forking and deleting threads is cheaper than processes.
• Similarly, pre-forked thread pool is created. May be limits to numbers but hopefully less of an issue than with processes since fewer resources required.
Thread Model: Pros and Cons
• Advantages:
  – Faster than processes. Creating/destroying cheaper.
  – Maintains programmer's way of thinking.
  – Sharing is enabled by default.
• Disadvantages:
  – Less robust. Threads not protected from each other.
  – Requires proper OS support; otherwise, if one thread blocks on a file read, it will block the whole address space.
  – Can still run out of threads if servicing many clients concurrently.
  – Can exhaust certain per-process limits not encountered with processes (e.g., number of open file descriptors).
  – Limited or no control over scheduling decisions.
Event Model (Ex: Flash)
• Use a single process and deal with requests in an event-driven manner, like a giant switchboard.
• Use the non-blocking option (O_NDELAY) on sockets, do everything asynchronously, never block on anything, and have the OS notify us when something is ready.

while (1) {
    accept new connections until none remaining;
    call select() on all active file descriptors;
    for each FD:
        if (fd ready for reading) call read();
        if (fd ready for writing) call write();
}
Event-Driven: Pros and Cons

• Advantages:
  – Very fast.
  – Sharing is inherent, since there's only one process.
  – Don't even need locks as in thread models.
  – Can maximize concurrency in request stream easily.
  – No context-switch costs or extra memory consumption.
  – Complete control over scheduling decisions.
• Disadvantages:
  – Less robust. Failure can halt whole server.
  – Pushes per-process resource limits (like file descriptors).
  – Not every OS has full asynchronous I/O, so can still block on a file read. Flash uses helper processes to deal with this (AMPED architecture).
In-Kernel Model (Ex: Tux)

• Dedicated kernel thread for HTTP requests:
  – One option: put the whole server in the kernel.
  – More likely, just deal with static GET requests in the kernel to capture the majority of requests.
  – Punt dynamic requests to a full-scale server in user space, such as Apache.

[Diagram: a user-space server runs HTTP above the SOCK/TCP/IP/ETH stack, crossing the user/kernel boundary on every request; a kernel-space server runs HTTP entirely below the user/kernel boundary, directly atop TCP/IP/ETH.]
In-Kernel Model: Pros and Cons
• In-kernel event model:
  – Avoids transitions to user space, copies across the u-k boundary, etc.
  – Leverages already existing asynchronous primitives in the kernel (kernel doesn't block on a file read, etc.)
• Advantages:
  – Extremely fast. Tight integration with kernel.
  – Small component without full server optimizes common case.
• Disadvantages:
  – Less robust. Bugs can crash the whole machine, not just the server.
  – Harder to debug and extend, since kernel programming is required, which is not as well-known as sockets.
  – Similarly, harder to deploy. APIs are OS-specific (Linux, BSD, NT), whereas sockets & threads are (mostly) standardized.
  – HTTP evolving over time; have to modify kernel code in response.
So What’s the Performance?
• Graph shows server throughput for Tux, Flash, and Apache.
• Experiments done on a 400 MHz P/II, gigabit Ethernet, Linux 2.4.9-ac10, 8 client machines, WaspClient workload generator.
• Tux is fastest, but Flash close behind.
Summary: Server Architectures
• Many ways to code up a server
  – Tradeoffs in speed, safety, robustness, ease of programming and extensibility, etc.
• Multiple servers exist for each kind of model
  – Not clear that a consensus exists.
• Better case for in-kernel servers as devices
  – e.g., reverse proxy accelerator, Akamai CDN node
• User-space servers have a role:
  – OS should provide proper primitives for efficiency
  – Leave HTTP-protocol related actions in user space
  – In this case, the event-driven model is attractive
• Key pieces to a fast event-driven server:
  – Minimize copying
  – Efficient event notification mechanism
Chapter 5: Workload Characterization
Workload Characterization
• Why characterize workloads?
  – Gives an idea about traffic behavior ("Which documents are users interested in?")
  – Aids in capacity planning ("Is the number of clients increasing over time?")
  – Aids in implementation ("Does caching help?")
• How do we capture them?
  – Through server logs (typically enabled)
  – Through packet traces (harder to obtain and to process)
Factors to Consider
• Where do I get logs from (client? proxy? server?)
  – Client logs give us an idea, but not necessarily the same
  – Same for proxy logs
  – What we care about is the workload at the server
• Is the trace representative?
  – Corporate POP vs. News vs. Shopping site
• What kind of time resolution?
  – e.g., second, millisecond, microsecond
• Does the trace/log capture all the traffic?
  – e.g., incoming link only, or one node out of a cluster
Probability Refresher
• Lots of variability in workloads
  – Use probability distributions to express it
  – Want to consider many factors
• Some terminology/jargon:
  – Mean: average of samples
  – Median: half are bigger, half are smaller
  – Percentiles: dump samples into N bins (median is the 50th-percentile number)
• Heavy-tailed: Pr[X > x] ~ c·x^(−a) as x → ∞
Important Distributions
Some Frequently-Seen Distributions:
• Normal (mean μ, std. dev. σ):
  f(x) = (1 / (σ √(2π))) · exp( −(x − μ)² / (2σ²) )
• Lognormal (x ≥ 0; σ > 0):
  f(x) = (1 / (x σ √(2π))) · exp( −(ln x − μ)² / (2σ²) )
• Exponential (x ≥ 0, rate λ):
  f(x) = λ e^(−λx)
• Pareto (x ≥ k, shape a, scale k):
  f(x) = a k^a x^(−(a+1))
More Probability
• Graph shows 3 distributions with average = 2.
• Note average ≠ median in some cases!
• Different distributions have different "weight" in the tail.
What Info is Useful?
• Request methods: GET, POST, HEAD, etc.
• Response codes: success, failure, not-modified, etc.
• Size of requested files
• Size of transferred objects
• Popularity of requested files
• Number of embedded objects
• Inter-arrival time between requests
• Protocol support (1.0 vs. 1.1)
Sample Logs for Illustration
             Chess1997         Olympics1998         IBM1998           IBM2001
Description  Kasparov-Deep     Nagano 1998          Corporate         Corporate
             Blue Event Site   Olympics Event Site  Presence          Presence
Period       2 weeks in        2 days in            1 day in          1 day in
             May 1997          Feb 1998             June 1998         Feb 2001
Hits         1,586,667         5,800,000            11,485,600        12,445,739
Bytes        14,171,711        10,515,507           54,697,108        28,804,852
Clients      256,382           80,921               860,211           319,698
URLs         2,293             30,465               15,788            42,874
We’ll use statistics generated from these logs as examples.
Request Methods
• KR01: "overwhelming majority" are GETs, few POSTs
• IBM2001 trace starts seeing a few 1.1 methods (CONNECT, OPTIONS, LINK), but still very small (1/10^5 %)

         Chess 1997   Olympics 1998   IBM 1998   IBM 2001
GET      96%          99.6%           99.3%      97%
HEAD     04%          00.3%           00.08%     02%
POST     00.007%      00.04%          00.02%     00.2%
Others   noise        noise           noise      noise
Response Codes

• Table shows percentage of responses.
• Majority are OK and NOT_MODIFIED.
• Consistent with numbers from AW96, KR01.

Code  Meaning             Chess1997  Olympics1998  IBM1998   IBM2001
200   OK                  85.32      76.02         75.28     67.72
204   NO_CONTENT          --.--      --.--         00.00001  --.--
206   PARTIAL_CONTENT     00.25      --.--         --.--     --.--
301   MOVED_PERMANENTLY   00.05      --.--         --.--     --.--
302   MOVED_TEMPORARILY   00.05      00.05         01.18     15.11
304   NOT_MODIFIED        13.73      23.24         22.84     16.26
400   BAD_REQUEST         00.001     00.0001       00.003    00.001
401   UNAUTHORIZED        --.--      00.001        00.0001   00.001
403   FORBIDDEN           00.01      00.02         00.01     00.009
404   NOT_FOUND           00.55      00.64         00.65     00.79
407   PROXY_AUTH          --.--      --.--         --.--     00.002
500   SERVER_ERROR        --.--      00.003        00.006    00.07
501   NOT_IMPLEMENTED     --.--      00.0001       00.0005   00.006
503   SERVICE_UNAVAIL     --.--      --.--         00.0001   00.0003
???   UNKNOWN             00.0003    00.00004      00.005    00.0004
Resource (File) Sizes
• Shows file/memory usage (not weighted by frequency!)
• Lognormal body, consistent with results from AW96, CB96, KR01.
• AW96, CB96: sizes have Pareto tail; Downey01: sizes are lognormal.
Tails from the File Size
• Shows the complementary CDF (CCDF) of file sizes.
• Haven't done the curve fitting, but looks Pareto-ish.
Response (Transfer) Sizes
• Shows network usage (weighted by frequency of requests)
• Lognormal body, Pareto tail, consistent with CBC95, AW96, CB96, KR01
Tails of Transfer Size
• Shows the complementary CDF (CCDF) of transfer sizes.
• Looks more Pareto-like; certainly some big transfers.
Resource Popularity
• Follows a Zipf model: p(r) = r^(−α) (α = 1 is true Zipf; other values are "Zipf-like")
• Consistent with CBC95, AW96, CB96, PQ00, KR01
• Shows that caching popular documents is very effective
Number of Embedded Objects
• Mah97: avg 3, 90% are 5 or less
• BC98: Pareto distribution, median 0.8, mean 1.7
• Arlitt98 World Cup study: median 15 objects, 90% are 20 or less
• MW00: median 7-17, mean 11-18, 90% 40 or less
• STA00: median 5, 30 (2 traces), 90% 50 or less
• Mah97, BC98, SCJO01: embedded objects tend to be smaller than container objects
• KR01: median is 8-20, Pareto distribution

Trend seems to be that the number is increasing over time.
Session Inter-Arrivals
• Inter-arrival time between successive requests
  – "Think time"
  – difference between user requests vs. ALL requests
  – partly depends on definition of boundary
• CB96: variability across multiple timescales, "self-similarity"; average load very different from peak or heavy load
• SCJO01: log-normal, 90% less than 1 minute
• AW96: independent and exponentially distributed
• KR01: Pareto with a = 1.5; session arrivals follow a Poisson distribution, but requests follow Pareto
Protocol Support
• IBM.com 2001 logs:
– Show roughly 53% of client requests are HTTP 1.1
• KA01 study:
– 92% of servers claim to support 1.1 (as of Sep 00)
– Only 31% actually do; most fail to comply with the spec
• SCJO01 show:
– Avg 6.5 requests per persistent connection
– 65% have 2 connections per page, the rest more
– 40-50% of objects are downloaded over persistent connections
Appears that we are in the middle of a slow transition to 1.1
Summary: Workload Characterization
• Traffic is variable:
– Responses vary across multiple orders of magnitude
• Traffic is bursty:
– Peak loads are much larger than average loads
• Certain files are more popular than others:
– A Zipf-like distribution captures this well
• Two-sided aspect of transfers:
– Most responses are small (zero is pretty common)
– Most of the bytes come from large transfers
• Controversy over Pareto vs. log-normal distributions
• Non-trivial for workload generators to replicate
Chapter 6: Workload Generators
Why Workload Generators?

• Allows stress-testing and bug-finding
• Gives us some idea of server capacity
• Allows us a scientific process to compare approaches
– e.g., server models, gigabit adaptors, OS implementations
• Assumption is that a difference in the testbed translates to some difference in the real world
• Allows the performance debugging cycle:

The Performance Debugging Cycle: Measure → Reproduce → Find Problem → Fix and/or Improve → (repeat)
Problems with Workload Generators
• Only as good as our understanding of the traffic
• Traffic may change over time
– generators must too
• May not be representative
– e.g., are file size distributions from IBM.com similar to mine?
• May be ignoring important factors
– e.g., browser behavior, WAN conditions, modem connectivity
• Still, useful for diagnosing and treating problems
How Does Workload Generation Work?
• Many clients, one server
– matches the asymmetry of the Internet
• Server is populated with some kind of synthetic content
• Simulated clients produce requests for the server
• Master process controls clients and aggregates results
• Goal is to measure the server
– not the client or the network
• Must be robust to error conditions
– e.g., if the server keeps sending "404 Not Found", will the clients notice?
Evolution: WebStone

• The original workload generator, from SGI in 1995
• Process-based workload generator, implemented in C
• Clients talk to master via sockets
• Configurable: # client machines, # client processes, run time
• Measured several metrics: avg + max connect time, response time, throughput rate (bits/sec), # pages, # files
• 1.0 only does GETs; CGI support added in 2.0
• Static requests, 5 different file sizes:
Percentage  Size
35.00       500 B
50.00       5 KB
14.00       50 KB
0.90        500 KB
0.10        5 MB

www.mindcraft.com/webstone
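A generator reproduces a fixed mix like this by inverting the cumulative percentages. A minimal sketch follows; the function name is made up, and the round byte counts (5 KB = 5000 bytes, etc.) are an assumption rather than WebStone's exact file contents:

```c
/* Map a uniform variate u in [0, 1) to a WebStone file size class
 * by walking the cumulative distribution of the published mix. */
long webstone_size_bytes(double u)
{
    if (u < 0.35)  return 500;      /* 35.00%: 500 B  */
    if (u < 0.85)  return 5000;     /* 50.00%: 5 KB   */
    if (u < 0.99)  return 50000;    /* 14.00%: 50 KB  */
    if (u < 0.999) return 500000;   /*  0.90%: 500 KB */
    return 5000000;                 /*  0.10%: 5 MB   */
}
```

Each simulated client would draw u from a uniform random number generator per request and fetch a file of the returned size.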
Evolution: SPECWeb96
• Developed by SPEC
– Standard Performance Evaluation Corporation
– Non-profit group with many benchmarks (CPU, FS)
• Attempt to be more representative
– Based on logs from NCSA, HP, Hal Computers
• 4 classes of files:
• Poisson distribution between each class
Percentage  Size
35.00       0-1 KB
50.00       1-10 KB
14.00       10-100 KB
1.00        100 KB - 1 MB
SPECWeb96 (cont)
• Notion of scaling versus load:
– number of directories in the data set doubles as expected throughput quadruples: dirs = sqrt(throughput / 5) * 10
– requests spread evenly across all application directories
• Process-based WG
• Clients talk to master via RPCs (less robust)
• Still only does GETs, no keep-alive

www.spec.org/osg/web96
Evolution: SURGE
• Scalable URL Reference GEnerator
– Barford & Crovella at Boston University CS Dept.
• Much more worried about representativeness; captures:
– server file size distributions
– request size distribution
– relative file popularity
– embedded file references
– temporal locality of reference
– idle periods ("think times") of users
• Process/thread-based WG
SURGE (cont)
• Notion of “user-equivalent”:
– statistical model of a user
– active “off” time (between URLs)
– inactive “off” time (between pages)
• Captures various levels of burstiness
• Not validated, but shows that the load generated is different from SpecWeb96's, with more burstiness in terms of CPU and # active connections

www.cs.wisc.edu/~pb
Evolution: S-client
• Almost all workload generators are closed-loop:
– client submits a request, waits for the server, maybe thinks for some time; repeat as necessary
• Problem with the closed-loop approach:
– client can't generate requests faster than the server can respond
– limits the generated load to the capacity of the server
– in the real world, arrivals don’t depend on server state
• i.e., real users have no idea about the load on the server when they click on a site, although successive clicks may have this property
– in particular, can't overload the server
• s-client tries to be open-loop:
– by generating connections at a particular rate
– independent of server load/capacity
S-Client (cont)

• How is s-client open-loop?
– connects asynchronously at a particular rate
– uses the non-blocking connect() socket call
• Did the connect complete within a particular time?
– if yes, continue normally
– if not, the socket is closed and a new connect initiated
• Other details:
– uses a single-address-space event-driven model like Flash
– calls select() on large numbers of file descriptors
– can generate large loads
• Problems:
– client capacity is still limited by the number of active FDs
– an “arrival” is a TCP connect, not an HTTP request

www.cs.rice.edu/CS/Systems/Web-measurement
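The core s-client trick, starting a non-blocking connect() and abandoning it if the handshake doesn't finish in time, looks roughly like this. This is a simplified sketch, not s-client's actual code; `connect_with_timeout` and `sclient_demo` are names invented here:

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Start a non-blocking connect; give up if the handshake hasn't
 * completed within timeout_ms.  Returns the connected fd, or -1. */
int connect_with_timeout(struct sockaddr_in *addr, int timeout_ms)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    fcntl(fd, F_SETFL, O_NONBLOCK);

    if (connect(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0 &&
        errno != EINPROGRESS) {
        close(fd);
        return -1;
    }

    fd_set wfds;
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };

    /* select() reports the socket writable once the handshake finishes */
    if (select(fd + 1, NULL, &wfds, NULL, &tv) <= 0) {
        close(fd);  /* timed out: abandon this attempt, try a fresh one */
        return -1;
    }

    int err = 0;
    socklen_t len = sizeof(err);
    getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
    if (err != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Self-check: connect to a loopback listener we create ourselves. */
int sclient_demo(void)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  /* let the kernel pick an ephemeral port */
    if (lfd < 0 || bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 8) < 0)
        return -1;
    socklen_t len = sizeof(addr);
    getsockname(lfd, (struct sockaddr *)&addr, &len);

    int fd = connect_with_timeout(&addr, 1000);
    if (fd >= 0)
        close(fd);
    close(lfd);
    return fd >= 0 ? 0 : -1;
}
```

An open-loop client would fire `connect_with_timeout` on a timer at the target rate, regardless of how many earlier connections are still outstanding.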
Evolution: SPECWeb99
• In response to people "gaming" the benchmark, now includes rules:
– IP maximum segment lifetime (MSL) must be at least 60 seconds (more on this later!)
– Link-layer maximum transmission unit (MTU) must not be larger than 1460 bytes (Ethernet frame size)
– Dynamic content may not be cached
• not clear that this is followed
– Servers must log requests
• W3C common log format is sufficient but not mandatory
– Resulting workload must be within 10% of target
– Error rate must be below 1%
• Metric has changed:
– now "number of simultaneous conforming connections": the rate of a connection must be greater than 320 Kbps
SPECWeb99 (cont)

• Directory size has changed:
– dirs = (25 + (400000 / 122000) * simultaneous_conns) / 5.0
• Improved HTTP 1.0/1.1 support:
– Keep-alive requests (client closes after N requests)
– Cookies
• Back-end notion of user demographics
– Used for ad rotation
– Request includes user_id and last_ad
• Request breakdown:
– 70.00% static GET
– 12.45% dynamic GET
– 12.60% dynamic GET with custom ad rotation
– 04.80% dynamic POST
– 00.15% dynamic GET calling CGI code
SPECWeb99 (cont)• Other breakdowns:
– 30 % HTTP 1.0 with no keep-alive or persistence– 70 % HTTP 1.0 with keep-alive to "model" persistence– still has 4 classes of file size with Poisson distribution– supports Zipf popularity
• Client implementation details:– Master-client communication now uses sockets– Code includes sample Perl code for CGI– Client configurable to use threads or processes
• Much more info on setup, debugging, tuning• All results posted to web page,
– including configuration & back end code
www.spec.org/osg/web99
SpecWeb99 vs. File Sizes
• SpecWeb99: In the ballpark, but not very smooth
SpecWeb99 vs. File Size Tail
• SpecWeb99 tail isn’t as long as real logs (900 KB max)
SpecWeb99 vs. Transfer Sizes

• Doesn’t capture 304 (Not Modified) responses
• Coarser distribution than real logs (i.e., not smooth)
Spec99 vs. Transfer Size Tails
• SpecWeb99 does OK, although tail drops off rapidly (and in fact, no file is greater than 1 MB in SpecWeb99!).
Spec99 vs. Resource Popularity
• SpecWeb99 seems to do a good job, although tail isn’t long enough
Evolution: TPC-W

• Transaction Processing Performance Council (TPC)
– More known for database workloads like TPC-D
– Metrics include dollars/transaction (unlike SPEC)
– Provides a specification, not source
– Meant to capture a large e-commerce site
• Models an online bookstore
– web serving, searching, browsing, shopping carts
– online transaction processing (OLTP)
– decision support (DSS)
– secure purchasing (SSL), best sellers, new products
– customer registration, administrative updates
• Has a notion of scaling per user
– 5 MB of DB tables per user
– 1 KB per shopping item, 25 KB per item in static images
TPC-W (cont)

• Remote browser emulator (RBE)
– emulates a single user
– sends an HTTP request, parses it, waits ("thinks"), repeats
• Metrics:
– WIPS: shopping
– WIPSb: browsing
– WIPSo: ordering
• Setups tend to be very large:
– multiple image servers, application servers, load balancer
– DB back end (typically SMP)
– Example: IBM 12-way SMP w/DB2, 9 PCs w/IIS: $1M

www.tpc.org/tpcw
Summary: Workload Generators
• Only the beginning. Many other workload generators:
– httperf from HP
– WAGON from IBM
– WaspClient from IBM
– Others?
• Both workloads and generators change over time:
– Both started simple, got more complex
– As the workload changes, so must the generators
• No one single "good" generator
– SpecWeb99 seems the favorite (2002 rumored in the works)
• Implementation issues similar to servers:
– They are network-based request producers (i.e., produce GETs instead of 200 OKs)
– Implementation affects capacity planning of clients! (want to make sure clients are not the bottleneck)
End of this tutorial…
• This is roughly half of a four-hour tutorial:
– ACM SIGMETRICS 2002 (June, Marina del Rey, CA)
• Remainder gets into more detailed issues:
– Event notification mechanisms in servers
– Overview of the TCP protocol
– TCP dynamics for servers
– TCP implementation issues for servers
• Talk to me if you’re still interested, or
• Point your browser at: www.sigmetrics.org
Chapter: Event Notification
• Event notification:
– Mechanism for the kernel and application to notify each other of interesting/important events
– E.g., connection arrivals, socket closes, data available to read, space available for writing
• Idea is to exploit concurrency:
– Concurrency in user workloads means the host CPU can overlap multiple events to maximize parallelism
– Keep the network and disk busy; never block
• Simultaneously, want to minimize costs:
– user/kernel crossings and testing idle socket descriptors
• Event notification changes applications:
– from state-based to event-based
– requires a change in thinking
Chapter: Introduction to TCP
• Layering is a common principle in network protocol design
• TCP is the major transport protocol in the Internet
• Since HTTP runs on top of TCP, much interaction between the two
• Asymmetry in client-server model puts strain on server-side TCP implementations
• Thus, major issue in web servers is TCP implementation and behavior
(Protocol stack: application / transport / network / link / physical)
Chapter: TCP Dynamics

• In this section we'll describe some of the problems you can run into as a WWW server interacting with TCP.
• Most of these affect the response as seen by the client, not the throughput generated by the server.
• Ideally, a server developer shouldn't have to worry about this stuff, but in practice we'll see that's not the case.
• Examples we'll look at include:
– The initial window size
– The delayed ACK problem
– Nagle and its interaction with delayed ACK
– Small receive windows interfering with loss recovery
Chapter: Server TCP Implementation
• In this section we look at ways in which the host TCP implementation is stressed under large web server workloads. Most of these techniques deal with large numbers of connections:
– Looking up arriving TCP segments with large numbers of connections
– Dealing with the TIME-WAIT state caused by closing large numbers of connections
– Managing large numbers of timers to support connections
– Dealing with memory consumption of connection state
• Removing data-touching operations
– byte copying and checksums