Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext...

20
1 CPSC 852 Internetworking Applications & Application-Layer Protocols: The Web & HTTP Michele Weigle Department of Computer Science Clemson University [email protected] http://www.cs.clemson.edu/~mweigle/courses/cpsc852 2 Application-Layer Protocols Outline ! The architecture of distributed systems » Client/Server computing » P2P computing » Hybrid (Client/Server and P2P) systems ! The programming model used in constructing distributed systems » Socket programming ! Example client/server systems and their application-layer protocols » The World-Wide Web (HTTP) » Reliable file transfer (FTP) » E-mail (SMTP & POP) » Internet Domain Name System (DNS) local ISP company network regional ISP application transport network link physical application

Transcript of Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext...

Page 1: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

1

CPSC 852

Internetworking

Applications &

Application-Layer Protocols:

The Web & HTTP

Michele WeigleDepartment of Computer Science

Clemson University

[email protected]

http://www.cs.clemson.edu/~mweigle/courses/cpsc852

2

Application-Layer ProtocolsOutline

! The architecture of distributed systems

» Client/Server computing

» P2P computing

» Hybrid (Client/Server and P2P) systems

! The programming model used in constructingdistributed systems

» Socket programming

! Example client/server systems andtheir application-layer protocols

» The World-Wide Web (HTTP)

» Reliable file transfer (FTP)

» E-mail (SMTP & POP)

» Internet Domain Name System (DNS)

local ISP

companynetwork

regional ISP

application

transport

network

link

physical

application

Page 2: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

3

local ISP

companynetwork

regional ISP

Applications and Application-Layer ProtocolsOverview

! Application:Communicating, distributedprocesses» Running in network hosts in

“user space”

» Exchange messages toimplement application

! Application-layer protocols» One “piece” of an application

» Defines messages exchangedand actions taken

» Uses services provided bylower layer protocols

application

transport

network

link

physical

application

application

transport

network

link

physical

application

application

transport

network

link

physical

application

4

Application-Layer ProtocolsThe Web

! User agent (client) forthe Web is called abrowser:

» MS Internet Explorer

» Mozilla Firefox

» Apple Safari

! Server for the Web iscalled a Web server:

» Apache (public domain)

» MS Internet InformationServer (IIS)

Page 3: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

5

Application-Layer ProtocolsWeb terminology

! Web page:

» Addressed by a URL

» Consists of “objects”

! Most Web pages consist of:

» Base HTML page

» Embedded objects

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html lang="en"><head> <meta http-equiv="content-type" content="text/html; charset=iso-8859-1"> <title>CNN.com</title> <meta http-equiv="refresh" content="1800; URL=http://www.cnn.com/?"> <link rel="StyleSheet" href="http://i.cnn.net/cnn/virtual/2001/style/main.css" type="text/css"> <script language="JavaScript1.1" src="http://i.cnn.net/cnn/virtual/2000/code/main.js" type="text/javascript"> </script> <script language="JavaScript1.1" type="text/javascript"> </script><script language="JavaScript1.1" src="http://ar.atwola.com/file/adsWrapper.js"></script><style type="text/css"></style><script language="JavaScript">document.adoffset=0</script></head>

<body class="cnnMainBody" bgcolor="#FFFFFF">

<a name="top_of_page"></a> : :

6

Web TerminologyURLs (Universal Resource Locators)

www.someSchool.edu:8080/someDept/pic.gif

Server domain name Object path name

Optional server port (Default = port 80)

! URL components

» Server address

» (Optional port number)

» Path name

Page 4: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

7

Web TerminologyThe Hypertext Transfer Protocol (HTTP)

! Web’s application layerprotocol

! Client/server model

» client: browser thatrequests, receives,“displays” Web objects

» server: Web server sendsobjects in response torequests

PC running

Firefox

Server

running

Apache

Mac running

Safari

HTTP request

HTTP request

HTTP response

HTTP response

! HTTP/1.0: RFC 1945

! HTTP/1.1: RFC 2616

8

The Hypertext Transfer Protocol HTTP Overview

! HTTP uses TCP sockets» Browser initiates TCP

connection to server (on port 80)

! HTTP messages (application -layer protocol messages)exchanged between browserand Web server

! HTTP/1.0: RFC 1945» One request/response

interaction per connection

! HTTP/1.1: RFC 2616» Persistent connections

» Pipelined connections

! HTTP is “stateless”» Server maintains no

information aboutpast browser requests

! Protocols that maintain “state”are complex!

» Past history (state) must bemaintained

» If server or client crashes,their views of “state” maybe inconsistent and mustbe reconciled

aside

Page 5: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

9

The Hypertext Transfer ProtocolHTTP example

! User enters URL www.someSchool.edu/someDept/home.index

» Referenced object contains HTML text and references10 JPEG images

! Browser sends an HTML “GET” request to the serverwww.someSchool.edu

Web

Server

Browser

HTTP request1

HTTP response1

! Server will retrieve andsend the HTML file

! Browser will read the fileand sequentially make 10separate requests for theembedded JPEG images

HTTP request11

HTTP response11

...

10

HTTP 1.0 ExampleURL www.someschool.edu/someDept/home.index

1) Browser initiates TCP connection toserver at www.someSchool.edu.Port 80 is “well known” for server

2) Server “accepts” connection3) Client writes an HTTP GETrequest message (containing path)to TCP connection socket

time

TCP 3-way handshake

5) Server closes TCP connection

4) Server reads request message, formsresponse message containingrequested object, writes message tosocket

Client Server

0) Server process at hostwww.someSchool.edu waitingfor TCP connections on port 80

Page 6: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

11

6) Browser reads response messagecontaining the HTML file.

Ten references to JPEG objects arefound during the HTML parse

The above steps are repeated for

each of the 10 JPEG objects

7) Browser initiates TCP connection toserver at www.someSchool.edu

8) Server “accepts” connection

TCP 3-way handshake

HTTP 1.0 ExampleURL www.someschool.edu/someDept/home.index

time

Client Server

12

The Hypertext Transfer ProtocolHTTP message format

! Two types of HTTP message formats: request andresponse messages

» ASCII (human-readable format)

! HTTP request message:

method <SP> path <SP> version <CR><LF>

header field name “:” value <CR><LF>

header field name “:” value <CR><LF>

<CR><LF>

entity body

» Request line

» Optionalheader lines

» Present onlyfor somemethods(e.g., POST)

Page 7: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

13

HTTP Message FormatNetscape Navigator & MS Explorer request examples

! How does Netscape process:

http://www.cs.clemson.edu:8080/~mweigle/ ?

GET /~mweigle/ HTTP/1.0Connection: Keep-AliveUser-Agent: Mozilla/4.74 [en] (WinNT; U)Host: www.cs.clemson.edu:8080Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*Accept-Encoding: gzipAccept-Language: enAccept-Charset: iso-8859-1,*,utf-8Cookie: SITESERVER=ID=8a064b7855a043146e45991174a3d970

14

HTTP Message FormatNetscape Navigator & MS Explorer request examples

GET /~mweigle/ HTTP/1.0Connection: Keep-AliveUser-Agent: Mozilla/4.74 [en] (WinNT; U)Host: www.cs.clemson.edu:8080Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*Accept-Encoding: gzipAccept-Language: enAccept-Charset: iso-8859-1,*,utf-8Cookie: SITESERVER=ID=8a064b7855a043146e45991174a3d970

GET /~mweigle/ HTTP/1.1Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/msword, application/vnd.ms-excel, application/vnd.ms-powerpoint, */*Accept-Language: en-usAccept-Encoding: gzip, deflateUser-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)Host: www.cs.clemson.edu:8080Connection: Keep-Alive

Page 8: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

15

HTTP Message FormatGeneral response message format

version <SP> code <SP> phrase <CR><LF>

header field name “:” value <CR><LF>

header field name “:” value <CR><LF>

<CR><LF>

entity body

! Response messages

» ASCII (human-readable format)

! Message structure:

» Status line

» Optionalheader lines

» Requestedobject, errormessagemessage, etc.

16

HTTP Message FormatTelnet example

> telnet www.cs.clemson.edu 80Trying 130.127.48.92...Connected to www.cs.clemson.edu.Escape character is '^]'.GET /~mweigle/foo.txt HTTP/1.0

HTTP/1.1 200 OKDate: Mon, 06 Sep 2004 19:22:18 GMTServer: Apache/1.3.31 (Unix)Last-Modified: Mon, 30 Aug 2004 20:35:29 GMTETag: "2fa6-76c-41338f91"Accept-Ranges: bytesContent-Length: 95Connection: closeContent-Type: text/plain

** This test file is stored in the UNIX** file system at** /home/mweigle/public_html/foo.txtConnection closed by foreign host.

Connect to HTTPserver port

Telnet output

Type GET commandplus blank line

HTTP responsestatus line

HTTPresponseheaders plusblank line

Object content

Telnet output

Page 9: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

17

HTTP Message FormatTelnet example (2)

> telnet www.msn.com 80Trying 207.46.179.134...Connected to www.msn.com.Escape character is '^]'.GET /~index.html HTTP/1.0

HTTP/1.1 404 Object Not FoundServer: Microsoft-IIS/5.0Date: Mon, 11 Feb 2002 18:33:15 GMTContent-Length: 1638Content-Type: text/html

<HTML> <HEAD> . . .. . . . Error type 404 - Object Not Found </body> </html>Connection closed by foreign host.

Connect to HTTPserver port

Telnet output

Type GET commandplus blank line

HTTP responsestatus line

HTTPresponseheaders plusblank line

Object content

Telnet output

18

HTTP Message FormatHTTP response status codes

200 OK

» Request succeeded, requested object later in this message

301 Moved Permanently

» Requested object moved, new location specified later inthis message (Location:)

400 Bad Request

» Request message not understood by server

404 Not Found

» Requested document not found on this server

505 HTTP Version Not Supported

! Sample response codes:

Page 10: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

19

HTTP Message FormatTypical Request and Response Headers

Connection: Keep-AliveUser-Agent: Mozilla/4.74 [en] (WinNT; U)Host: www.cs.clemson.edu:8080Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*Accept-Encoding: gzipAccept-Language: enAccept-Charset: iso-8859-1,*,utf-8Cookie: SITESERVER=ID=8a064b785a043146e4599174a3d970

Request

headers

Response

headers

Date: Fri, 02 Feb 2001 19:10:11 GMTServer: Apache/1.3.9 (Unix) (Red Hat/Linux)Last-Modified: Tue, 30 Jan 2001 21:48:14 GMTETag: "1807135e-67-3a77369e"Accept-Ranges: bytesContent-Length: 103Connection: closeContent-Type: text/plain

20

HTTP Protocol DesignNon-persistent connections

! The default browser/server behavior in HTTP/1.0 isfor the connection to be closed after the completion ofthe request

» Server parses request, responds, and closes TCP connection

» The Connection: keep-alive header allows forpersistent connections

Web

Server

Browser

TCP connection

establishment! With non-persistentconnections at least 2 RTTs arerequired to fetch every object

» 1 RTT for TCP handshake

» 1 RTT for request/response

HTTP request

HTTP response

Page 11: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

21

Non-Persistent ConnectionsPerformance

A

B

propagation

transmission

nodalprocessing

queueing

Web

Server

Browser

TCP connection

establishment

HTTP request

HTTP response

! With non-persistentconnections at least 2 RTTs arerequired to fetch every object

» 1 RTT for TCP handshake

» 1 RTT for request/response

22

Non-Persistent ConnectionsPerformance

A

B

propagation

transmission

nodalprocessing

queueing

! Example: A 1 Kbyte base page with five 1.5 Kbyteembedded images coming from the West coast on anOC-48 link» 1 RTT for TCP handshake = 50 ms

» 1 RTT for request/response = 50 ms

! Page download time with non-persistent connections?

! Page download time with a persistent connection?

Page 12: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

23

Non-Persistent ConnectionsParallel connections

! To improve performance a browser can issue multiplerequests in parallel to a server (or servers)» Server parses request, responds, and closes TCP connection

Web

Server

Browser

TCP connection

establishment

HTTP request

HTTP response

Web

Server

TCP connection

establishmentHTTP request

HTTP response

! Page download time with parallel connections?» 2 parallel connections =

» 4 parallel connections =

24

HTTP Protocol DesignPersistent v. non-persistent connections

! Non-persistent

» HTTP/1.0

» Server parses request, responds, and closes TCP connection

» At least 2 RTTs to fetch every object

! Persistent

» Default for HTTP/1.1 (negotiable in 1.0)

» Client sends requests for multiple objects on one TCP connection

» Server, parses request, responds, parses next request, responds...

» Fewer RTTs

! Parallel v. persistent connections?

Page 13: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

25

Persistent ConnectionsPersistent connections with pipelining

Persistent without pipelining:

! Client issues new request only when previous

response has been received

! One RTT for each referenced object

Persistent with pipelining:

! Default in HTTP/1.1

! Client sends requests as soon as it encounters a

referenced object

! As little as one RTT for all the referenced objects

26

Persistent ConnectionsWithout Pipelining

HTTP request msg

base HTTP response msg

HTTP request msg

(1st embedded object)

HTTP response msg

(1st embedded object)

HTTP request msg

(2nd embedded object)

HTTP response msg

(2nd embedded object)

Client Server

Time

! Client issues new

request only when

previous response has

been received

! One RTT for each

referenced object

Page 14: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

27

Persistent ConnectionsWith Pipelining

HTTP request msg

base HTTP response msg

HTTP request msg

(1st embedded object)

HTTP response msg

(1st embedded object)

HTTP request msg

(2nd embedded object)

HTTP response msg

(2nd embedded object)

Client Server

Time

! Default in HTTP/1.1

! Client sends requests

as soon as it

encounters a

referenced object

! As little as one RTT

for all the referenced

objects

28

HTTP User-Server InteractionAuthentication

! Problem: How to limitaccess to server documents?» Servers provide a means to

require users to authenticate themselves

! HTTP includes a header tagfor user to specify name andpassword (on a GET request)» If no authorization presented,

server refuses access, sendsWWW authenticate:header line in response

! Stateless: client must sendauthorization for each request» A stateless design

» (But browser may cache credentials)

usual HTTP request msg

401: authorization

WWW authenticate:

usual HTTP request msg

+ authorization:

usual HTTP response msg

usual HTTP request msg

+ authorization:

usual HTTP response msg

Client Server

Time

Page 15: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

29

HTTP User-Server InteractionCookies

! Server sends “cookie”to browser in responsemessageSet-cookie:<value>

! Browser presents cookie inlater requests to same servercookie: <value>

! Server matches cookie withserver-stored information» Provides authentication

» Client-side state main-tenance(remembering userpreferences, previous choices,…)

usual HTTP request msg

usual HTTP response +Set-cookie: S1

usual HTTP request msg

cookie: S1

usual HTTP request msg

cookie: S1

cookie-

specific

action

cookie-

specific

action

usual HTTP response msg

usual HTTP response +Set-cookie: S2

Client Server

30

HTTP User-Server InteractionBrowser caches

Internet

browserorigin server

miss

origin serverBrowser withdisk cache

Internethit

! Browsers cache content from servers to avoid futureserver interactions to retrieve the same content

Page 16: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

31

HTTP User-Server InteractionThe conditional GET

! If object in browser cacheis “fresh,” the server won’tre-send it» Browsers save current date

along with object in cache

! Client specifies the date ofcached copy in HTTPrequest

If-modified-since:<date>

! Server’s response containsthe object only if it hasbeen changed since thecached date

! Otherwise server returns:

HTTP/1.0 304 Not Modified

HTTP requestIf-modified-since:

<date>

HTTP responseHTTP/1.0

304 Not Modified

object

not

modified

HTTP requestIf-modified-since:

<date>

HTTP responseHTTP/1.0 200 OK

<data>

object

modified

Client Server

32

HTTP User-Server InteractionCache Performance for HTTP Requests

! What is the average time to retrieve a web object?

» Tmean = hit ratio x Tcache + (1 – hit ratio) x Tserver

where hit ratio is the fraction of objects found in the cache

» Mean access time from a disk cache =

» Mean access time from the origin server =

! For a 60% hit ratio, the mean client access time is:

» (0.6 x 10 ms) + (0.4 x 1,000 ms) = 406 ms

Origin ServerBrowser with

disk cache

CacheMiss

Cache HitNetwork

Page 17: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

33

Cache Performance for HTTP RequestsWhat determines the hit ratio?

! Cache size

! Locality of references

» How often the same web object is requested

! How long objects remain “fresh” (unchanged)

! Object references that can’t be cached at all

» Dynamically generated content

» Protected content

» Content purchased for each use

» Content that must always be up-to-date

» Advertisements (“pay-per-click” issues)

34

The Impact of Web Traffic on the InternetMCI backbone traffic in bytes by protocol (1998)

Page 18: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

35

Traffic Makeup on UNC LinkInbound traffic

! Port 412 = file sharing

! Port 7668, 6349 = You tell me!!

36

Caching on the WebWeb caches (Proxy servers)

! Users configure browsers tosend all requests through ashared proxy server» Proxy server is a large

cache of web objects

! Web caches are used to satisfy client requests withoutcontacting the origin server

HTTP request

HTTP requestHTTP response (hit)

HTTP response

HTTP response

clientProxy

server

client

Origin

server

Open research question:How does the proxy hit ratiochange with the populationof users sharing it?

HTTP request (m

iss)

! Browsers send all HTTPrequests to proxy» If object in cache, proxy

returns object in HTTPresponse

» Else proxy requests objectfrom origin server, thenreturns it in HTTP responseto browser

HTTP response

Page 19: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

37

Why do Proxy Caching?The performance implications of caching

! Consider a cache that is“close” to client

» E.g., on the same LAN

! Nearby caches mean:

» Smaller response times

» Decreased traffic on egresslink to institutional ISP(often the primarybottleneck)

To improve Web response times

should one buy a 10 Mbps

access link or a proxy server?

originservers

campusnetwork

1.5 Mbpsaccess link

10 Mbps LAN

publicInternet

proxyserver

38

Why do Proxy Caching?The performance implications of caching

! Web performance without caching:

» Mean object size = 50 Kbits

» Mean request rate = 29/sec

» Mean origin server access time = 1sec

originservers

campusnetwork

1.5 Mbpsaccess link

10 Mbps LAN

reqs

sec29

50 Kbits/req

1.5 MbpsX = 0.97

» Average response time = ??

! Traffic intensity on the accesslink:

publicInternet

1000

ms

Page 20: Application-Layer Protocolsmweigle/clemson/courses/cpsc852-f05/lectures/2-2-HTTP.pdfThe Hypertext Transfer Protocol (HTTP)!WebÕs application layer protocol!Client/server model Èclient:

39

Why do Proxy Caching?The performance implications of caching

! Upgrade the access link to 10 Mb/s

» Response time = ??

» Queuing is negligible hence response time =1 sec

! Add a proxy cache with 40% hit ratioand 10 ms access time

» Response time = ??

» Traffic intensity on access link =

originservers

campusnetwork

1.5 Mbpsaccess link

10 Mbps LAN

0.4 x 10 ms + 0.6 x 1,000 ms = 604 ms

0.6 x 0.97 = 0.58

» Response time =

! A proxy cache lowers response time,lowers access link utilization, and saves money!

publicInternet

1000

ms

40

Why do Proxy Caching?The case for proxy caching

! Lower latency for user’s webrequests

! Reduced traffic at all network levels

! Reduced load on servers

! Some level of fault tolerance(network, servers)

! Reduced costs to ISPs, contentproviders, etc., as web usagecontinues to grow exponentially

! More rapid distribution of content

originservers

campusnetwork

1.5 Mbpsaccess link

10 Mbps LAN

proxyserver

publicInternet