HTTP WEB
Risanuri Hidayat, Ir., M.Sc.
World Wide Web
T. Berners-Lee, R. Fielding, H. Frystyk: “Hypertext Transfer Protocol - HTTP/1.0”, RFC 1945, 1996.
• Naming scheme for resources: URL, URN, URI
• Multimedia documents: MIME encoding (RFC)
• Transfer protocol: HTTP/1.0, HTTP/1.1
• Implemented over TCP/IP
• Integrated with Internet infrastructure: DNS, SMTP
History
• Hypertext systems: no network access protocol
• Gopher, WAIS: no hyperlinks
• WWW @ CERN (Tim Berners-Lee, 1990)
• HTTP/0.9 (1992)
Internet Applications
Application               Application layer protocol          Underlying transport protocol
e-mail                    smtp [RFC 821]                      TCP
remote terminal access    telnet [RFC 854]                    TCP
Web                       http [RFC 2068]                     TCP
file transfer             ftp [RFC 959]                       TCP
streaming multimedia      proprietary (e.g., RealNetworks)    TCP or UDP
remote file server        NFS                                 TCP or UDP
Internet telephony        proprietary (e.g., Vocaltec)        typically UDP
What is HTTP
HTTP stands for Hypertext Transfer Protocol. It's the network protocol used to deliver virtually all files and other data (collectively called resources) on the World Wide Web, whether they're HTML files, image files, query results, or anything else. Usually, HTTP takes place through TCP/IP sockets (and this tutorial ignores other possibilities).
A browser is an HTTP client because it sends requests to an HTTP server (Web server), which then sends responses back to the client. The standard (and default) port for HTTP servers to listen on is 80, though they can use any port.
HTTP is used to transmit resources, not just files. A resource is some chunk of information that can be identified by a URL.
HTTP
Request message (method, URL or pathname, HTTP version, headers, message body):
    GET //www.dcs.qmw.ac.uk/index.html HTTP/1.1
Reply message (HTTP version, status code, reason, headers, message body):
    HTTP/1.1 200 OK
    resource data
• Resource := MIME-encoded data
• Content negotiation
• Authentication
Methods:
• GET, HEAD, POST
• PUT, DELETE, TRACE, OPTIONS, CONNECT
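URL
http://www.cdk3.net:8888/WebExamples/earth.html
• DNS lookup: the host name www.cdk3.net is resolved to an IP number (55.55.55.55 in the figure)
• Resource ID (IP number, port number, pathname): 55.55.55.55, 8888, WebExamples/earth.html
• Network address: 2:60:8c:2:b0:5a
• Socket: connects the client to the Web server, which maps the pathname to a file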
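The split of a URL into the pieces a client needs can be sketched in a few lines of Python. This is only an illustration: the helper name `split_url` and the fallback to port 80 are assumptions made for the example, and the DNS lookup step itself is left out because it needs network access.

```python
from urllib.parse import urlsplit

def split_url(url, default_port=80):
    """Split an http URL into (host, port, path).

    split_url is a hypothetical helper for this sketch, not part of any
    standard API. The port falls back to 80 when the URL names none.
    """
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else default_port
    path = parts.path or "/"          # an empty path means the root resource
    return parts.hostname, port, path

host, port, path = split_url("http://www.cdk3.net:8888/WebExamples/earth.html")
print(host, port, path)
```

A real client would next resolve the host name (for instance with `socket.gethostbyname`) and open a socket to the resulting IP number and port.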
HTTP Transactions
HTTP uses the client-server model:
• An HTTP client opens a connection and sends a request message to an HTTP server;
• the server then returns a response message, usually containing the resource that was requested.
After delivering the response, the server closes the connection (making HTTP a stateless protocol, i.e. not maintaining any connection information between transactions).
HTTP Protocol
http: hypertext transfer protocol
• WWW’s application layer protocol
• client/server model
  • client: browser that requests, receives, “displays” WWW objects
  • server: WWW server sends objects in response to requests
• http1.0: RFC 1945
• http1.1: RFC 2068
[Figure: a PC running Explorer and a SUN running Netscape Navigator each send http requests to, and receive http responses from, a server running the Apache Web server]
HTTP Protocol
http: TCP transport service:
• client initiates TCP connection (creates socket) to server, port 80
• server accepts TCP connection from client
• http messages (application-layer protocol messages) exchanged between browser (http client) and WWW server (http server)
• TCP connection closed
http is “stateless”
• server maintains no information about past client requests
Protocols that maintain “state” are complex!
• past history (state) must be maintained
• if server/client crashes, their views of “state” may be inconsistent, must be reconciled
HTTP Protocol
The formats of the request and response messages are similar, and English-oriented. Both kinds of messages consist of:
• an initial line,
• zero or more header lines,
• a blank line (i.e. a CRLF by itself), and
• an optional message body (e.g. a file, or query data, or query output).
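The four-part layout described above can be sketched as a small Python function. The name `split_message` is made up for this example, and the sketch assumes well-formed input with CRLF line endings and no folded (multi-line) headers.

```python
def split_message(raw: bytes):
    """Split a raw HTTP message into (initial_line, headers, body).

    Hypothetical helper: assumes CRLF line endings and one header per line.
    """
    # The first blank line (CRLF by itself) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    initial_line = lines[0]
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return initial_line, headers, body

raw = b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
print(split_message(raw))
```

The same function handles requests and responses, because only the initial line differs between the two kinds of message.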
Request
Initial Request Line
A request line has three parts, separated by spaces: a method name, the local path of the requested resource, and the version of HTTP being used.
A typical request line is:
    GET /path/to/file/index.html HTTP/1.0
• GET is the most common HTTP method; it says "give me this resource". Other methods include POST and HEAD. Method names are always uppercase.
• The path is the part of the URL after the host name, also called the request URI (a URI is like a URL, but more general).
• The HTTP version always takes the form "HTTP/x.x", uppercase.
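As a rough sketch of these rules, the following Python function checks a request line against the three-part shape and returns its parts. The function name and the regular expression are assumptions for illustration; real servers are more lenient about what they accept.

```python
import re

# method (uppercase letters), request URI (no spaces), version "HTTP/x.x"
REQUEST_LINE = re.compile(r"^([A-Z]+) (\S+) (HTTP/\d+\.\d+)$")

def parse_request_line(line: str):
    """Return (method, path, version), or raise ValueError if malformed."""
    match = REQUEST_LINE.match(line)
    if match is None:
        raise ValueError(f"malformed request line: {line!r}")
    return match.groups()

print(parse_request_line("GET /path/to/file/index.html HTTP/1.0"))
```

A lowercase method such as `get`, or a version written as `http/1.0`, is rejected by this pattern, which mirrors the uppercase rule stated above.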
HTTP Request Header Format
Two types of messages: request, response
http request message: ASCII (human-readable format)
    GET /somedir/page.html HTTP/1.1             <- request line (GET, POST, HEAD commands)
    Connection: close                           <- header lines
    User-agent: Mozilla/4.0
    Accept: text/html, image/gif, image/jpeg
    Accept-language: en
    (extra carriage return, line feed)          <- a carriage return, line feed on its own line ends the header lines; with no message body, it also ends the message
Response/Reply
Initial Response Line (Status Line)
The initial response line, called the status line, also has three parts separated by spaces:
• the HTTP version,
• a response status code that gives the result of the request, and
• an English reason phrase describing the status code.
Typical status lines are:
    HTTP/1.0 200 OK
or
    HTTP/1.0 404 Not Found
HTTP Reply Header Format
    HTTP/1.1 200 OK                             <- status line (protocol, status code, status phrase)
    Connection: close                           <- header lines
    Date: Thu, 06 Aug 1998 12:00:15 GMT
    Server: Apache/1.3.0 (Unix)
    Last-Modified: Mon, 22 Jun 1998 …...
    Content-Length: 6821
    Content-Type: text/html

    data data data data data ...                <- data, e.g., requested html file
HTTP Reply Status Code
• 200 OK: request succeeded, requested object later in this message
• 301 Moved Permanently: requested object moved, new location specified later in this message (Location:)
• 400 Bad Request: request message not understood by server
• 404 Not Found: requested document not found on this server
• 505 HTTP Version Not Supported
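A client usually splits the status line and then branches on the status code. The sketch below shows one way to do that in Python. The function names are made up for the example, and the grouping of codes by their first digit comes from the HTTP specification rather than from these slides.

```python
from http import HTTPStatus

def parse_status_line(line: str):
    """Split 'HTTP/1.0 404 Not Found' into (version, code, reason)."""
    version, code, reason = line.split(" ", 2)   # the reason may contain spaces
    return version, int(code), reason

def status_class(code: int) -> str:
    """Name the class of a status code from its first digit (per the HTTP spec)."""
    classes = {1: "informational", 2: "success", 3: "redirection",
               4: "client error", 5: "server error"}
    return classes.get(code // 100, "unknown")

def standard_reason(code: int) -> str:
    """Look up the standard reason phrase in Python's http.HTTPStatus table."""
    return HTTPStatus(code).phrase

for line in ("HTTP/1.0 200 OK", "HTTP/1.0 404 Not Found"):
    version, code, reason = parse_status_line(line)
    print(code, reason, "->", status_class(code))
```

Programs should branch on the numeric code; the reason phrase is meant for human readers.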
Sample HTTP Exchange
To retrieve the file at the URL http://www.somehost.com/path/file.html first open a socket to the host www.somehost.com, port 80 (use the default port of 80 because none is specified in the URL). Then, send something like the following through the socket:
    GET /path/file.html HTTP/1.0
    From: [email protected]
    User-Agent: HTTPTool/1.0
    [blank line here]
The server should respond with something like the following, sent back through the same socket:
    HTTP/1.0 200 OK
    Date: Fri, 31 Dec 1999 23:59:59 GMT
    Content-Type: text/html
    Content-Length: 1354

    <html>
    <body>
    <h1>Happy New Millennium!</h1>
    (more file contents) . . .
    </body>
    </html>
After sending the response, the server closes the socket.
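The exchange above can be sketched with Python's `socket` module. This is a minimal illustration, not a robust client: the function names `build_request` and `fetch` are invented here, the 10-second timeout is an arbitrary choice, and the code relies on the HTTP/1.0 behaviour described above, where the server closes the socket once the response is complete.

```python
import socket

def build_request(path, headers):
    """Build the bytes of a GET request: request line, headers, blank line."""
    lines = [f"GET {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

def fetch(host, path, port=80, headers=None):
    """Send one GET request and return the raw response bytes."""
    request = build_request(path, headers or {})
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # empty read: the server closed the socket
                break
            chunks.append(data)
    return b"".join(chunks)

print(build_request("/path/file.html", {"User-Agent": "HTTPTool/1.0"}))
```

Calling `fetch("www.somehost.com", "/path/file.html", headers={"User-Agent": "HTTPTool/1.0"})` would perform the exchange shown in the slides, provided that host exists and answers on port 80.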
User-server interaction: authentication
Authentication goal: control access to server documents
• stateless: client must present authorization in each request
• authorization: typically name, password
  • Authorization: header line in request
  • if no authorization presented, server refuses access, sends a WWW-Authenticate: header line in response
[Figure: client/server timeline. The client sends a usual http request msg; the server replies 401: authorization req. with a WWW-Authenticate: line. Each later usual http request msg carries an Authorization: line and receives a usual http response msg.]
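The slides do not name a specific scheme, so the sketch below assumes the common Basic scheme, in which the name and password are joined with a colon and base64-encoded. The function names and the realm string "documents" are invented for the example.

```python
import base64

def authorization_value(username: str, password: str) -> str:
    """Build the value of an Authorization: header (Basic scheme, an assumption)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")

def handle(request_headers: dict, accepted: set):
    """Server side: return (status, extra response headers).

    The server keeps no memory of earlier requests, so every request
    must carry its own Authorization: line.
    """
    if request_headers.get("Authorization") in accepted:
        return 200, {}
    return 401, {"WWW-Authenticate": 'Basic realm="documents"'}

value = authorization_value("Aladdin", "open sesame")
print(handle({}, {value}))                          # first request, no credentials
print(handle({"Authorization": value}, {value}))    # retry with credentials
```

Base64 is an encoding, not encryption, so Basic credentials are readable by anyone who can see the request.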
User-server interaction: cookies
• Server sends “cookie” to client in response
  Set-cookie: #
• Client presents cookie in later requests
  cookie: #
• Server matches presented cookie with server-stored cookies
  • authentication
  • remembering user preferences, previous choices
[Figure: client/server timeline. The first usual http request msg is answered by a usual http response + Set-cookie: #. Each later usual http request msg carries cookie: #, the server performs a cookie-specific action, and returns a usual http response msg.]
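The matching step on the server side might look like the sketch below. The class `CookieServer`, the cookie name `id`, and the visit counter standing in for "user preferences" are all assumptions made for illustration; only `http.cookies.SimpleCookie` is a real standard-library class.

```python
import secrets
from http.cookies import SimpleCookie

class CookieServer:
    """Toy server that remembers per-client state keyed by a cookie value."""

    def __init__(self):
        self.sessions = {}            # cookie value -> server-stored state

    def handle(self, request_headers: dict):
        """Return (extra response headers, state for this client)."""
        cookie = SimpleCookie(request_headers.get("Cookie", ""))
        if "id" in cookie and cookie["id"].value in self.sessions:
            state = self.sessions[cookie["id"].value]
            state["visits"] += 1      # the cookie-specific action
            return {}, state
        # Unknown client: issue a fresh cookie in the response.
        new_id = secrets.token_hex(8)
        self.sessions[new_id] = {"visits": 1}
        return {"Set-Cookie": f"id={new_id}"}, self.sessions[new_id]

server = CookieServer()
headers, state = server.handle({})                       # first request
later_headers, later_state = server.handle({"Cookie": headers["Set-Cookie"]})
print(state is later_state, later_state["visits"])
```

The protocol itself stays stateless: the state lives in the server's table, and the cookie is only the key the client presents with each request.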
User-server interaction: conditional GET
Goal: don’t send object if client has up-to-date stored (cached) version
• client: specify date of cached copy in http request
  If-modified-since: <date>
• server: response contains no object if cached copy up-to-date:
  HTTP/1.0 304 Not Modified
[Figure: client/server timeline with two cases. Object not modified: http request msg with If-modified-since: <date>, http response HTTP/1.0 304 Not Modified. Object modified: http request msg with If-modified-since: <date>, http response HTTP/1.1 200 OK … <data>.]
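The server's decision can be sketched as a single comparison of dates. The function name `conditional_get` and its arguments are assumptions for this example; the date parsing uses `email.utils.parsedate_to_datetime` from the Python standard library, which understands the time stamp format used in HTTP headers.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def conditional_get(request_headers: dict, last_modified: datetime, body: bytes):
    """Return (status, body): 304 with no object if the cached copy is current.

    last_modified must be a timezone-aware datetime.
    """
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            cached = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            cached = None                      # unparseable date: ignore it
        if cached is not None:
            if cached.tzinfo is None:          # treat a zone-less date as GMT
                cached = cached.replace(tzinfo=timezone.utc)
            if last_modified <= cached:
                return 304, b""
    return 200, body

modified = datetime(1998, 6, 22, 12, 0, 0, tzinfo=timezone.utc)
print(conditional_get({"If-Modified-Since": "Thu, 06 Aug 1998 12:00:15 GMT"}, modified, b"data"))
print(conditional_get({"If-Modified-Since": "Mon, 01 Jun 1998 00:00:00 GMT"}, modified, b"data"))
```

In the first call the cached copy is newer than the object, so nothing is sent back; in the second the object changed after the cached date, so the full body is returned.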
Message format: multimedia extensions
• MIME (Multipurpose Internet Mail Extensions), RFC 2045, 2046
• additional lines in msg header declare MIME content type
    From: [email protected]
    To: [email protected]
    Subject: Picture of yummy crepe.
    MIME-Version: 1.0                           <- MIME version
    Content-Transfer-Encoding: base64           <- method used to encode data
    Content-Type: image/jpeg                    <- multimedia data type, subtype, parameter declaration

    base64 encoded data .....                   <- encoded data
    .........................
    ......base64 encoded data .
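Python's `email` package can produce a message with these MIME header lines. The sketch below is illustrative only: the addresses are placeholders (the originals are obscured in the slides) and the payload bytes are not a real JPEG image.

```python
from email.message import EmailMessage

def build_picture_message(sender, recipient, subject, jpeg_bytes):
    """Build a mail message whose body is a base64-encoded JPEG."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # For bytes content, set_content adds Content-Type, a base64
    # Content-Transfer-Encoding, and MIME-Version: 1.0.
    msg.set_content(jpeg_bytes, maintype="image", subtype="jpeg")
    return msg

msg = build_picture_message(
    "alice@example.com",            # placeholder address (assumption)
    "bob@example.com",              # placeholder address (assumption)
    "Picture of yummy crepe.",
    b"\xff\xd8\xff\xe0 stand-in bytes, not a real image",
)
print(msg.as_string())
```

HTTP reuses the Content-Type line of this format to tell the client what kind of data a response carries.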
MIME types
• Text
  example subtypes: plain, html
• Image
  example subtypes: jpeg, gif
• Audio
  example subtypes: basic (8-bit mu-law encoded), 32kadpcm (32 kbps coding)
• Video
  example subtypes: mpeg, quicktime
• Application
  other data that must be processed by reader before “viewable”
  example subtypes: msword, octet-stream
HTTP Headers (samples)
• User-Agent: Mozilla/4.0
• Accept: (client-side) text/html, image/*
• Content-type: (server-side) text/html
• Expires, Last-Modified, If-Modified-Since: absolute time stamps (1-sec resolution)
  E.g.: Thu, 03 Jun 1999 20:16:34 GMT
• Accept-Language, Accept-Charset
• Content-encoding
Mean #bytes per header: 300 (requests), 160 (responses)
* Require parsing!
HTTP/1.1 Improvements
• B/W optimization
  • persistent connections
  • pipelining: does not block waiting for previous responses
  • end-of-message mechanism
  • Content-range: access only specified “range” of a resource
• Explicit cache control (Cache-control)
• Digest authentication (Content-MD5)
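To make the “range” idea concrete, the sketch below serves part of a resource in response to a byte range. The request header `Range: bytes=first-last` and the status codes 206 and 416 come from the HTTP/1.1 specification rather than from these slides, and the function `serve_range` is a simplified assumption that handles only a single range with an explicit start.

```python
def serve_range(resource: bytes, range_header: str):
    """Return (status, headers, body) for a single 'bytes=first-last' range."""
    unit, _, spec = range_header.partition("=")
    first_text, _, last_text = spec.partition("-")
    if unit != "bytes" or not first_text.isdigit():
        return 200, {}, resource              # unsupported form: send everything
    first = int(first_text)
    # An open-ended range ("bytes=7-") runs to the end of the resource.
    last = int(last_text) if last_text.isdigit() else len(resource) - 1
    last = min(last, len(resource) - 1)
    if first > last:                          # range starts past the end
        return 416, {"Content-Range": f"bytes */{len(resource)}"}, b""
    headers = {"Content-Range": f"bytes {first}-{last}/{len(resource)}"}
    return 206, headers, resource[first:last + 1]

print(serve_range(b"0123456789", "bytes=2-5"))
print(serve_range(b"0123456789", "bytes=7-"))
```

Byte positions are inclusive and start at zero, so `bytes=2-5` selects four bytes.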
Web Caches (proxy server)
Goal: satisfy client request without involving origin server
• User sets browser: WWW accesses via web cache
• client sends all http requests to web cache
  • if object at web cache, web cache immediately returns object in http response
  • else requests object from origin server, then returns http response to client
[Figure: two clients send http requests to a proxy server and receive http responses from it; the proxy server in turn exchanges http requests and http responses with two origin servers]
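The if/else rule of the web cache can be sketched as a small class. Everything here is illustrative: the class `WebCache` and the `fetch_from_origin` callable are assumed names, and the sketch never expires or revalidates stored objects, which a real proxy must do.

```python
class WebCache:
    """Toy web cache: answer from the store if possible, else ask the origin."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin   # callable: url -> response
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.store:                 # object at web cache
            self.hits += 1
            return self.store[url]
        self.misses += 1                      # else: go to the origin server
        response = self.fetch_from_origin(url)
        self.store[url] = response
        return response

origin_requests = []

def fake_origin(url):
    """Stand-in for a real origin server; records each request it receives."""
    origin_requests.append(url)
    return b"object for " + url.encode("ascii")

cache = WebCache(fake_origin)
cache.get("http://www.somehost.com/path/file.html")
cache.get("http://www.somehost.com/path/file.html")
print(cache.hits, cache.misses, len(origin_requests))
```

The second request is answered without contacting the origin, which is exactly the goal stated on the slide.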
Why WWW Caching?
Assume: cache is “close” to client (e.g., in same network)
• smaller response time: cache “closer” to client
• decrease traffic to distant servers
  • link out of institutional/local ISP network often bottleneck
[Figure: origin servers on the public Internet, reached over a 1.5 Mbps access link from an institutional network with a 10 Mbps LAN and an institutional cache]
Web caching (in)effectiveness
• Observed hit ratios below 50%
  • even lower byte-weighted ratios!
• Possible remedies?
  • Prefetching
  • Delta-encoding
  • HTML macros
  • Duplicate suppression (digest-based)
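The difference between the plain hit ratio and the byte-weighted hit ratio can be seen with a short calculation. The request log below is invented purely for illustration and does not come from any measurement; the function assumes a non-empty log.

```python
def hit_ratios(requests):
    """Return (hit ratio, byte-weighted hit ratio).

    requests is a non-empty sequence of (size_in_bytes, was_hit) pairs.
    """
    total = hits = total_bytes = hit_bytes = 0
    for size, was_hit in requests:
        total += 1
        total_bytes += size
        if was_hit:
            hits += 1
            hit_bytes += size
    return hits / total, hit_bytes / total_bytes

# Made-up log: small objects hit the cache, one large object misses.
log = [(1_000, True), (2_000, True), (1_000, False), (96_000, False)]
ratio, byte_ratio = hit_ratios(log)
print(f"hit ratio {ratio:.0%}, byte-weighted {byte_ratio:.0%}")
```

Half of the requests are hits, yet they account for only 3,000 of the 100,000 bytes transferred, so the cache saves far less bandwidth than the plain hit ratio suggests.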
HTTP status & perspective
J. C. Mogul, “What’s wrong with HTTP (and why it doesn’t matter)”, Proc. USENIX Technical Conference, 1999
• Definitely not optimal
• Probably adequate
  • It works well enough
  • It’s not the only game in town
    • Two-way initiation of operations
    • Real-time
    • Deferred delivery
  • Revising it again would be too hard
    • HTTP/1.0 -> HTTP/1.1 evolution took 4+ years!