Hyper Text Transfer Protocol


Page 1: Hyper Text Transfer Protocol

Hyper Text Transfer Protocol

(HTTP)

Page 2: Hyper Text Transfer Protocol

HTTP

• HTTP defines how Web pages are requested and served on the Internet

• Early servers and browsers used an ad-hoc approach

• A standardized protocol, called HTTP/1.0, was derived from this

• The earlier approach is now called HTTP/0.9

• Later, HTTP/1.0 was extended to HTTP/1.1

• The protocol versions are backward compatible

– servers and browsers which can handle HTTP/1.1 can also handle HTTP/1.0 and HTTP/0.9

Page 3: Hyper Text Transfer Protocol

History: “HTTP/0.9”

• HTTP/0.9 was very simple:

– A browser would send a request like this to a server:

GET /hobbies.html

– In response, the server would send the contents of the requested file.

– Only GET requests were supported

– Only a file path and name could appear in a GET request

– The response had to be a HTML document.

Page 5: Hyper Text Transfer Protocol

History (contd.)

• Different browsers/servers soon extended this basic scheme in various ways

• To achieve some standardization, the HTTP/1.0 protocol was specified, in 1996, in a document called RFC1945

– (for historical reasons, an Internet standard spec is called a Request for Comments or RFC)

• This was soon extended to HTTP/1.1, in RFC2068, released in January 1997

• An update to RFC2068 was produced in June 1999, as RFC2616

• Various other protocols, based on HTTP, have been produced from time to time

– we will see a “cookie” protocol, based on HTTP, which was specified in February 1997, in RFC2109

Page 6: Hyper Text Transfer Protocol

How HTTP Works

• HTTP sits on TCP, which, in turn, sits on IP

• Usually, HTTP servers are configured to listen to TCP/IP Port 80

– although sometimes a different port is used,

– particularly if two HTTP servers are running on one machine

• You can see how HTTP works by pretending to be a browser yourself

• Using telnet to connect to a server, you can issue a request and see the response

Page 7: Hyper Text Transfer Protocol

Example

• If you were to point a browser at the URL

http://student.cs.ucc.ie

you would get a HTML home-page which provides links to various pages for students, etc.

• The server on student.cs.ucc.ie uses the standard HTTP port, Port 80, so you can get the same page by

– telnetting to Port 80 on student.cs.ucc.ie

– and typing a GET request

Page 8: Hyper Text Transfer Protocol

Connecting to the HTTP server on student.cs.ucc.ie

• On any machine, say interzone, specify the address and port in a telnet command:

interzone.ucc.ie> telnet student.cs.ucc.ie 80

• You will get the following response:

Trying 143.239.211.125...

Connected to student.cs.ucc.ie.

Escape character is '^]'.

• The HTTP server is now listening

Page 9: Hyper Text Transfer Protocol

Requesting the home page

• Issue the following HTTP/1.0 request, noting that you must type two carriage returns:

GET / HTTP/1.0 [RETURN]

[RETURN]

• The response consists of

– a status line,

– a sequence of headers and

– the requested home page

• Then you are told that the telnet connection was closed by the server,

as you will see on the next slide

Page 11: Hyper Text Transfer Protocol

The reply to your request:

• The server’s response:

HTTP/1.1 200 OK

...

Content-Type: text/html

<HTML>

...

</HTML>

• Then your local telnet program tells you that the connection was closed by the server:

Connection closed by foreign host.

interzone.ucc.ie>
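The telnet session above can also be reproduced in a few lines of code. The following is a minimal sketch, not a full HTTP client: the function names `exchange` and `fetch` are illustrative, and it assumes the server closes the connection after responding, as the server in the slides does for an HTTP/1.0 request.

```python
import socket


def exchange(sock: socket.socket, request: bytes) -> bytes:
    """Send one request on a connected socket and read until the peer closes."""
    sock.sendall(request)
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # an empty read means the server closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)


def fetch(host: str, request: bytes, port: int = 80) -> bytes:
    """Open a TCP connection, as telnet does, and perform one exchange."""
    with socket.create_connection((host, port), timeout=10) as sock:
        return exchange(sock, request)
```

Calling `fetch("student.cs.ucc.ie", b"GET / HTTP/1.0\r\n\r\n")` would correspond to the session on the slides, provided that host is still reachable.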

Page 12: Hyper Text Transfer Protocol

Getting a different page:

• Consider the page whose URL is

http://student.cs.ucc.ie/cs1064/jabowen/

• Telnet to the server:

interzone.ucc.ie> telnet student.cs.ucc.ie 80

• When the server is listening, ask for the page like this:

GET /cs1064/jabowen/ HTTP/1.0 [RETURN]

[RETURN]

Page 13: Hyper Text Transfer Protocol

What was going on above:

• Once connected to a HTTP server, we can

– send a HTTP request line,

– optionally followed by request headers.

• In the cases above,

GET / HTTP/1.0

and

GET /cs1064/jabowen/ HTTP/1.0

were request lines

• Each request line was terminated by pressing [RETURN]

• In each case, the second [RETURN] marked the end of an empty list of request headers
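The structure just described (a request line, an optional list of headers, and an empty line that ends the list) can be sketched as a small helper. The name `build_request` is hypothetical; the sketch uses CRLF line endings, which is what HTTP specifies for each [RETURN].

```python
def build_request(method, target, version="HTTP/1.0", headers=None):
    """Return the bytes of a request: request line, headers, blank line."""
    lines = [f"{method} {target} {version}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Two empty strings make the join end with CRLF CRLF: the first CRLF
    # terminates the last line, the second marks the end of the header list.
    lines.extend(["", ""])
    return "\r\n".join(lines).encode("ascii")
```

With no headers, the second [RETURN] from the slides is exactly the blank line that closes an empty header list.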

Page 14: Hyper Text Transfer Protocol

GET requests

• In GET / HTTP/1.0

– the / is the resource the client wants to get

– the HTTP/1.0 tells the server that the client is using the HTTP/1.0 protocol

• In GET /cs1064/jabowen/ HTTP/1.0

– the /cs1064/jabowen/ is the resource the client wants to get

– the HTTP/1.0 tells the server that the client is using the HTTP/1.0 protocol

• In each case, the server responds by sending a status line, a number of response headers and the content of the requested resource.

Page 15: Hyper Text Transfer Protocol

Consider the response:

HTTP/1.1 200 OK

...

Content-Type: text/html

<HTML>

...

</HTML>

• The first line, HTTP/1.1 200 OK , is a status line

• The next few lines, ending in the line Content-Type: text/html, are header lines

• The lines bounded by <HTML> and </HTML> form the content of the requested resource.
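The three parts of a response can be separated mechanically. The sketch below is a simplification under stated assumptions: the helper name `parse_response` is made up for illustration, header names are lower-cased for lookup, and repeated or folded headers are not handled.

```python
def parse_response(raw: bytes):
    """Split a raw response into status line fields, headers and content."""
    # The first empty line separates the headers from the content.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")

    # Status line: protocol version, numeric code, English phrase.
    parts = lines[0].split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, code, reason, headers, body
```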

Page 16: Hyper Text Transfer Protocol

HEAD requests

• HEAD requests were new in HTTP/1.0

• A HEAD request is similar to a GET, the only difference being the use of the word HEAD instead of the word GET, for example:

HEAD /cs1064/jabowen/ HTTP/1.0 [RETURN]

[RETURN]

• The server sends the same status line and the same response headers as if it had received a GET request,

– but does not send the actual content of the resource mentioned in the request.

• Thus, human clients can use HEAD requests to

– easily access information about a resource on a server

– without being overwhelmed by the mass of detail that would be received if the resource content were sent in the response

Page 17: Hyper Text Transfer Protocol

Example HEAD request

• Suppose, for example, we wanted to see information about

http://student.cs.ucc.ie/cs1064/jabowen/

such as its size, when it was last edited, etc.

• We can send the request

HEAD /cs1064/jabowen/ HTTP/1.0

Page 18: Hyper Text Transfer Protocol

Response to example HEAD request:

HTTP/1.1 200 OK

Date: Wed, 13 Dec 2000 12:21:35 GMT

Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1

Last-Modified: Thu, 07 Dec 2000 13:16:18 GMT

ETag: "2160-29c6-3a2f8da2"

Accept-Ranges: bytes

Content-Length: 10694

Connection: close

Content-Type: text/html

Page 19: Hyper Text Transfer Protocol

Analysis of response:

• The first line in the response

HTTP/1.1 200 OK

is the status line in which

– HTTP/1.1 indicates that the server can use HTTP/1.1 (although it can accept requests in earlier HTTP forms)

– 200 is a code which indicates the outcome of the request, as determined by the server

– OK is an English language phrase giving the meaning of the status code

• The other lines in the response give information either about the server or the resource:

Page 20: Hyper Text Transfer Protocol

Analysis (contd.)

Date: Wed, 13 Dec 2000 12:21:35 GMT

– gives date/time of the response

Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1

– gives details on server

Last-Modified: Thu, 07 Dec 2000 13:16:18 GMT

– says when resource was last modified

ETag: "2160-29c6-3a2f8da2"

– provides a supposedly-unique string to identify this entity

Accept-Ranges: bytes

– says that this server could serve up pieces of this resource, pieces specifiable to the nearest byte

Content-Length: 10694

– gives the size of the resource

Connection: close

– says that the server does not regard this as a persistent connection

Content-Type: text/html

– gives the type of data in the resource
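A program can extract the same information from the headers. The following sketch re-uses the header lines of the example HEAD response; the function name `summarize` and the keys of the dictionary it returns are illustrative choices, not part of HTTP.

```python
from email.utils import parsedate_to_datetime

# Header lines copied from the example HEAD response.
HEAD_RESPONSE = "\r\n".join([
    "HTTP/1.1 200 OK",
    "Date: Wed, 13 Dec 2000 12:21:35 GMT",
    "Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1",
    "Last-Modified: Thu, 07 Dec 2000 13:16:18 GMT",
    'ETag: "2160-29c6-3a2f8da2"',
    "Accept-Ranges: bytes",
    "Content-Length: 10694",
    "Connection: close",
    "Content-Type: text/html",
]) + "\r\n\r\n"


def summarize(head_text):
    """Turn the headers of a HEAD response into typed values."""
    lines = head_text.strip().split("\r\n")
    headers = dict(line.split(": ", 1) for line in lines[1:])
    served = parsedate_to_datetime(headers["Date"])
    modified = parsedate_to_datetime(headers["Last-Modified"])
    return {
        "size_bytes": int(headers["Content-Length"]),
        "unchanged_for": served - modified,  # a datetime.timedelta
        "media_type": headers["Content-Type"],
        "server_will_close": headers.get("Connection", "").lower() == "close",
    }
```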

Page 21: Hyper Text Transfer Protocol

Another example

• Suppose we wanted to learn about the resource with URL

http://student.cs.ucc.ie/cs1064/jabowen/vh40.gif

• We can send the request

HEAD /cs1064/jabowen/vh40.gif HTTP/1.0

• Response is:

HTTP/1.1 200 OK

Date: Wed, 13 Dec 2000 12:23:04 GMT

Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1

Last-Modified: Fri, 24 Nov 2000 11:46:00 GMT

ETag: "3133-361-3a1e54f8"

Accept-Ranges: bytes

Content-Length: 865

Connection: close

Content-Type: image/gif

Page 23: Hyper Text Transfer Protocol

HTTP/1.1

A (fairly) detailed description

Page 24: Hyper Text Transfer Protocol

• We have just seen some example HTTP/1.0 interactions

• The same kinds of concepts we saw in these interactions will arise as we examine HTTP/1.1 in more detail

• The versions of HTTP have a great deal in common, so, in what follows, much of what is said will be true of all three versions

• Therefore, any mention of just “HTTP” will mean that the statement applies to HTTP/0.9, HTTP/1.0 and HTTP/1.1

Page 25: Hyper Text Transfer Protocol

Overall Operation of HTTP

• The HTTP protocol is a request/response protocol.

• request

– An HTTP message sent by a client to a server

• response

– An HTTP message sent by a server to a client which has made a request.

• client

– A program that establishes connections for the purpose of sending requests.

• server

– A program that accepts connections in order to service requests by sending back responses.

• As we shall see, a program may act as both a client and a server.

Page 26: Hyper Text Transfer Protocol

Message from a client:

A client sends, over a connection, to a server

• a request line in the form of

– a request method,

– a URI (Uniform Resource Identifier), and

– a protocol version,

• possibly followed by a message containing

– request modifiers,

– information about the client,

• and (possibly) body content.

Page 27: Hyper Text Transfer Protocol

Response from a server:

The server responds with

• a status line, in the form of

– the message's protocol version,

– a success or error code and

– an English phrase explaining the code

• possibly followed by a message containing

– server information,

– information about the entity in the body content (if any)

• and (possibly) body content.

Page 28: Hyper Text Transfer Protocol

HTTP Communication

• Most communication

– is started by a user agent and

– consists of a request to be applied to a resource on some origin server.

• user agent

– A client (browser, spider, etc.) which initiates a request.

• resource

– A data object or service that can be identified by a URI.

• origin server

– The server on which a resource resides or is to be created.

Page 30: Hyper Text Transfer Protocol

Simple communication

• Involves single connection between user agent (UA) and origin server (O)

• This connection is denoted, in diagrams on this and future slides, by -------

====request chain ==========>

UA -----------------------------------O

<=========response chain====

Page 31: Hyper Text Transfer Protocol

More complicated case

• Intermediaries present in request/response chain.

====request chain =======================>

UA ----------- A ----------- B ----------- C ----------- O

<======================response chain====

• Above, 3 intermediaries (A, B, and C) lie between user agent and origin server.

• Intermediaries act as both clients and servers

• Request or response message that travels the whole chain passes through 4 separate connections:

– UA-A connection;

– A-B connection;

– B-C connection;

– C-O connection

Page 32: Hyper Text Transfer Protocol

Simple versus complicated

• Distinction is important because some HTTP options may apply

– only to the connection with the nearest neighbour,

– only to the end-points of the chain,

– or to all connections along the chain.

Page 33: Hyper Text Transfer Protocol

3 forms of intermediary

• proxy, an agent which

– receives a request for a resource whose URI is in its absolute form and,

– if necessary, rewrites all or part of the message and forwards the reformatted request toward the server identified by the URI.

• gateway, an agent which

– acts as a translation interface to a server for another protocol, such as WAP, etc.

• tunnel, an agent which

– acts as a relay point between two connections without changing messages;

– tunnels are used, for example, in security firewalls

Page 34: Hyper Text Transfer Protocol

Caching

Page 35: Hyper Text Transfer Protocol

Caching

• User agents, proxies and gateways (but not tunnels) may use a local cache to handle requests, instead of forwarding them on to an origin server

• A request/response chain is shortened if one of the parties along the chain has a cached response applicable to the request.

Page 36: Hyper Text Transfer Protocol

Example Network topology

The example caching scenarios in the next few slides will use this network:

UA3____________D
               |
UA2_____       |
        |      |
        |      |
UA1_____A______B________C_________O

Page 37: Hyper Text Transfer Protocol

Caching Example 1

====request chain ====================>

UA1 ----------- A ----------- B -------- C --------- O

<==================response chain=====

• In the example above:

– the user has made a request for a resource on origin server O

– neither UA1 nor any of the proxies A, B or C has an appropriate cached response

– so the request has been forwarded all the way to O

– Four connections are involved in servicing the request

Page 38: Hyper Text Transfer Protocol

Caching Example 2

request

chain

UA1…………….... A ……... B …….. C …… O

response

chain

• In the example above:

– the user has repeated the same request for a resource on O

– UA1 has a cached response to the earlier request and gives this to the user without sending the request anywhere

– No connection is involved in servicing the request

Page 39: Hyper Text Transfer Protocol

Caching Example 3

===request chain =>

UA2 -----------------

UA1 …..……...... A …….. B …….. C ……... O

<=response chain==

• In the example above:

– the user at UA2 has requested the same resource on origin server O that was earlier requested by the user at UA1

– UA2 has forwarded the request to proxy A

– proxy A has an appropriate cached response, from when it serviced the earlier request from UA1

– Only one connection is involved in servicing the request

Page 40: Hyper Text Transfer Protocol

Caching Example 4

===request chain ====>

UA3 ---------- D --------

|

UA1 …..…... A …….. B …….. C ……... O

<===response chain===

• In the example above:

– the user at UA3 has requested the same resource on origin server O that was earlier requested by the user at UA1

– UA3 has forwarded the request to proxy D, which has forwarded it to proxy B

– proxy B has an appropriate cached response, from when it serviced the earlier request from UA1

– Two connections are involved in servicing the request
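The four caching examples can be replayed with a toy model. This is only a sketch under simplifying assumptions: every node on the chain caches every response it relays, cached entries never expire, and the names `Node` and `fetch` are invented for illustration.

```python
class Node:
    """A user agent or proxy with its own local cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}


def fetch(path, uri, origin):
    """Walk the chain from the user agent towards the origin server.

    `path` lists the nodes from the user agent to the last proxy;
    `origin` maps URIs to content. Returns (content, connections used).
    """
    for hops, node in enumerate(path):
        if uri in node.cache:  # cache hit: the chain is shortened here
            content = node.cache[uri]
            break
    else:  # no cache hit anywhere: go all the way to the origin server
        hops = len(path)
        content = origin[uri]
    # The response travels back, and each node it passes keeps a copy.
    for node in path[:hops]:
        node.cache[uri] = content
    return content, hops


origin = {"/page.html": "<HTML>...</HTML>"}
ua1, ua2, ua3 = Node("UA1"), Node("UA2"), Node("UA3")
a, b, c, d = Node("A"), Node("B"), Node("C"), Node("D")

connections = [
    fetch([ua1, a, b, c], "/page.html", origin)[1],  # Example 1
    fetch([ua1, a, b, c], "/page.html", origin)[1],  # Example 2
    fetch([ua2, a, b, c], "/page.html", origin)[1],  # Example 3
    fetch([ua3, d, b, c], "/page.html", origin)[1],  # Example 4
]
```

Run in this order, the list `connections` holds the connection counts stated in the four examples: 4, 0, 1 and 2.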

Page 41: Hyper Text Transfer Protocol

To cache or not?

• Not all responses are usefully cacheable

• As we will see later, some requests may contain modifiers which place special requirements on cache behavior.

• The same is true of responses

Page 43: Hyper Text Transfer Protocol

Caching/Proxy architectures

• A wide variety of cache and proxy architectures/configurations exist, including:

– national hierarchies of proxy caches to save international and/or inter-continental bandwidth,

– systems that broadcast or multicast cache entries,

– organizations that distribute subsets of cached data via CD-ROM,

– and so on.

Page 44: Hyper Text Transfer Protocol

Connections

Page 45: Hyper Text Transfer Protocol

Temporary Connections

• In most implementations of HTTP/1.0, a server closed a connection after it had serviced the request received on that connection:

– We saw this earlier, when the server on student.cs.ucc.ie closed the telnet connection that we had established, after it had sent its response to the HTTP/1.0 GET request we had sent

• The use of inline images, sound files, etc., in web pages often requires a client to make multiple requests of the same server when loading one document

• Thus the temporary connections provided by HTTP/1.0 meant that loading even one web page required many separate TCP connections (one to fetch each inline image, each sound file, etc.)

• This imposed a significant unnecessary load on HTTP servers and caused congestion on the Internet.

Page 46: Hyper Text Transfer Protocol

Advantages of Persistent Connections

Persistent HTTP connections offer a number of advantages:

– By opening and closing fewer TCP connections, CPU time is saved

– HTTP requests and responses can be pipelined on a connection, allowing a client to make multiple requests without waiting for each response

– Network congestion is reduced by reducing the number of packets caused by TCP opens,

– Latency on subsequent requests is reduced since there is no time spent in TCP's connection-opening handshake.

Page 47: Hyper Text Transfer Protocol

Persistent Connections in HTTP/1.1

• Unlike in HTTP/1.0 and earlier, persistent connections are the default behavior of any HTTP/1.1 connection.

• This means that, in HTTP/1.1, when a connection has been opened to service a request, it is kept open for further possible requests from the same client

• This is true even if the initial request triggered an error response from the server

• But, when no further request has been received after some time-out period, the server may close the connection

• However, a client can indicate, when making a request, that it wants the connection closed after the request is serviced

Page 48: Hyper Text Transfer Protocol

Connection Persistency Negotiation

• HTTP/1.1 provides a mechanism by which a client and a server can signal the close of a TCP connection.

– the Connection: header field.

• If a HTTP/1.1 client wants a connection closed after it receives a response to its request, it should include, in the request, a Connection: header containing the token "close".

• Similarly, if a HTTP/1.1 server intends to close a connection after it sends a response to a request, it should include, in the response, a Connection: header containing the token "close".

• If either the client or the server sends the close token in a Connection: header, that request becomes the last one for the connection.
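The negotiation rules can be condensed into one decision function. This is a sketch under an assumption the slides also make: an HTTP/1.0 client is treated as unable to handle persistent connections. The function name `connection_persists` is hypothetical.

```python
def connection_persists(request_version, request_headers, response_headers):
    """Decide whether the connection stays open after this exchange."""

    def has_close_token(headers):
        # A Connection: header may carry several comma-separated tokens.
        tokens = headers.get("Connection", "").split(",")
        return "close" in (token.strip().lower() for token in tokens)

    if request_version != "HTTP/1.1":
        # Assumption: pre-1.1 clients get a temporary connection.
        return False
    # Either side can make this request the last one on the connection.
    return not (has_close_token(request_headers)
                or has_close_token(response_headers))
```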

Page 49: Hyper Text Transfer Protocol

Example 1: Introduction

• A human, using a telnet client, sends a HTTP/1.0 request to a HTTP/1.1 server

• The server assumes that the client, because it is using HTTP/1.0, cannot handle persistent connections and, in its response, signals its intention to close the connection

• After printing the response, the telnet client says that the connection was closed by the foreign host

Page 50: Hyper Text Transfer Protocol

Example 1

interzone.ucc.ie> telnet student.cs.ucc.ie 80

Trying 143.239.211.125...

Connected to student.cs.ucc.ie.

Escape character is '^]'.

HEAD /cs1064/jabowen/ HTTP/1.0

HTTP/1.1 200 OK

Date: Sat, 06 Jan 2001 17:56:44 GMT

Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1

Last-Modified: Wed, 20 Dec 2000 11:34:46 GMT

ETag: "2160-2dee-3a409956"

Accept-Ranges: bytes

Content-Length: 11758

Connection: close

Content-Type: text/html

Connection closed by foreign host.

Page 51: Hyper Text Transfer Protocol

Example 2: Introduction

• A human, using a telnet client, sends a HTTP/1.1 request to a HTTP/1.1 server

• The server assumes that the client, because it is using HTTP/1.1, wants a persistent connection

– thus, there is no Connection: header in the response

• The telnet client prints the response for the human to see

• After a significant delay (the time-out period), the server realizes the client has no further request and closes the connection

• The telnet client then tells the human that the connection was closed by the foreign host

Page 52: Hyper Text Transfer Protocol

Example 2:

interzone.ucc.ie> telnet student.cs.ucc.ie 80

Trying 143.239.211.125...

Connected to student.cs.ucc.ie.

Escape character is '^]'.

HEAD /cs1064/jabowen/ HTTP/1.1

Host: student.cs.ucc.ie

HTTP/1.1 200 OK

Date: Sat, 06 Jan 2001 17:57:08 GMT

Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1

Last-Modified: Wed, 20 Dec 2000 11:34:46 GMT

ETag: "2160-2dee-3a409956"

Accept-Ranges: bytes

Content-Length: 11758

Content-Type: text/html

A time-out period elapses before server closes connection

Connection closed by foreign host.

Page 53: Hyper Text Transfer Protocol

Example 3: Introduction

• A human, using a telnet client, sends a HTTP/1.1 request to a HTTP/1.1 server

• The client knows that, because it is using HTTP/1.1, the server will think it wants a persistent connection

• Since the client does not want a persistent connection it sends a Connection: header with a close token in the request

• Seeing this, the server indicates its intention to close the connection immediately, by including a Connection: header with a close token in its response

• The telnet client prints the response for the human to see and, immediately thereafter, tells the human that the connection was closed by the foreign host

Page 54: Hyper Text Transfer Protocol

Example 3:

interzone.ucc.ie> telnet student.cs.ucc.ie 80

Trying 143.239.211.125...

Connected to student.cs.ucc.ie.

Escape character is '^]'.

HEAD /cs1064/jabowen/ HTTP/1.1

Host: student.cs.ucc.ie

Connection: close

HTTP/1.1 200 OK

Date: Sat, 06 Jan 2001 17:57:58 GMT

Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1

Last-Modified: Wed, 20 Dec 2000 11:34:46 GMT

ETag: "2160-2dee-3a409956"

Accept-Ranges: bytes

Content-Length: 11758

Connection: close

Content-Type: text/html

Connection closed by foreign host. (No time-out delay before this from telnet client)

Page 55: Hyper Text Transfer Protocol

Pipelining Requests

• A client that supports persistent connections may "pipeline" its requests (i.e., send multiple requests without waiting for each response).

• A server must send its responses to those requests in the same order that the requests were received.

Page 56: Hyper Text Transfer Protocol

Example 4: Introduction

• A human, using a telnet client, sends two HTTP/1.1 requests to a HTTP/1.1 server, sending the second request before even receiving a response to the first request

• Since there are only two requests to make, the client sends a Connection: header with a close token in the second request

• The server responds to both requests and, because of the close token in the 2nd request, indicates its intention to close the connection immediately, by including a Connection: header with a close token in its response to the 2nd request.

• The telnet client prints the responses for the human to see and, immediately thereafter, tells the human that the connection was closed by the foreign host

Page 57: Hyper Text Transfer Protocol

Example 4: the pipelined requests

interzone.ucc.ie> telnet student.cs.ucc.ie 80

Trying 143.239.211.125...

Connected to student.cs.ucc.ie.

Escape character is '^]'.

HEAD http://student.cs.ucc.ie/cs1064/jabowen/ HTTP/1.1

Host: student.cs.ucc.ie

HEAD http://student.cs.ucc.ie/cs4400/jabowen/ HTTP/1.1

Host: student.cs.ucc.ie

Connection: close

Page 58: Hyper Text Transfer Protocol

Example 4: the sequence of responses

HTTP/1.1 200 OK

Date: Wed, 31 Jan 2001 20:01:41 GMT

Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1

Last-Modified: Thu, 25 Jan 2001 13:26:32 GMT

ETag: "2160-2e25-3a702988"

Accept-Ranges: bytes

Content-Length: 11813

Content-Type: text/html

HTTP/1.1 200 OK

Date: Wed, 31 Jan 2001 20:01:41 GMT

Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1

Last-Modified: Wed, 20 Dec 2000 12:42:39 GMT

ETag: "13d3a-2b60-3a40a93f"

Accept-Ranges: bytes

Content-Length: 11104

Connection: close

Content-Type: text/html

Connection closed by foreign host. (No time-out delay before this message from telnet client)
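Because responses arrive in the order in which the requests were sent, a client can match them up by position. The sketch below works only for responses to HEAD requests, which carry no content, so each response ends at its first blank line; the function names are illustrative.

```python
def split_head_responses(stream):
    """Split a stream of header-only responses into (status line, headers)."""
    responses = []
    for block in stream.split("\r\n\r\n"):
        if not block.strip():
            continue  # skip the empty piece after the final blank line
        lines = block.split("\r\n")
        headers = dict(line.split(": ", 1) for line in lines[1:])
        responses.append((lines[0], headers))
    return responses


def pair_in_order(request_targets, responses):
    """Match each pipelined request with its response, first to first."""
    return list(zip(request_targets, responses))
```

Applied to the two responses above, the second pair would be the one whose headers contain Connection: close.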

Page 59: Hyper Text Transfer Protocol

Pipelining Requests (contd.)

• Clients which assume persistent connections and pipeline immediately after connection establishment should be prepared to retry their connection if the first pipelined attempt fails.

• If a client does such a retry, it must NOT pipeline before it knows the connection is persistent.

• Clients must also be prepared to resend their requests if the server closes the connection before sending all of the corresponding responses.

Page 60: Hyper Text Transfer Protocol

Pipelining Requests (contd.)

• Care must be taken when pipelining

– because some requests (called non-idempotent requests) may change the state of the server (for example, by changing a database used by the server)

• Clients should NOT pipeline such requests

• Otherwise, a premature termination of the transport connection could lead to indeterminate results.

• A client wishing to send a non-idempotent request should wait to send that request until it has received the response status for the previous request.
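One way a client might follow this advice is to group its queued requests into batches, where only the requests inside a batch are pipelined. The sketch below is an illustration, not a prescribed algorithm: the set of idempotent methods is the one listed in RFC 2616, and the name `pipeline_batches` is hypothetical.

```python
# Methods that RFC 2616 treats as idempotent.
IDEMPOTENT = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}


def pipeline_batches(methods):
    """Group request methods into batches that are safe to pipeline.

    A non-idempotent request (e.g. POST) is placed in a batch of its own,
    so it is sent only after all earlier responses have arrived and
    nothing is pipelined behind it.
    """
    batches, current = [], []
    for method in methods:
        if method.upper() in IDEMPOTENT:
            current.append(method)
        else:
            if current:
                batches.append(current)
            batches.append([method])
            current = []
    if current:
        batches.append(current)
    return batches
```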

Page 62: Hyper Text Transfer Protocol

Uniform Resource Identifiers

Page 63: Hyper Text Transfer Protocol

Uniform Resource Identifiers

• URIs have been known by many names:

– WWW addresses,

– Universal Document Identifiers,

– Universal Resource Identifiers,

– Uniform Resource Locators (URL)

– Uniform Resource Names (URN).

• For HTTP, URIs are simply formatted strings which identify (by name, location, or any other characteristic) a resource.

Page 65: Hyper Text Transfer Protocol

General URI Syntax

• URIs in HTTP can be represented in absolute form or relative to some known base URI.

• The two forms are differentiated by the fact that absolute URIs always begin with a scheme name followed by a colon.

Page 66: Hyper Text Transfer Protocol

http-scheme URIs

• A URI which is based on the http scheme must be of the syntactic form

"http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]

where items enclosed in [] are optional

• If the port is not given, Port 80 is assumed.
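The components named in this syntax can be picked apart with Python's standard `urllib.parse.urlsplit`. This is a minimal sketch: the function name `split_http_uri` and the dictionary keys are illustrative, and an absent abs_path is reported as "/".

```python
from urllib.parse import urlsplit


def split_http_uri(uri):
    """Return the host, port, abs_path and query of an http-scheme URI."""
    parts = urlsplit(uri)
    if parts.scheme != "http":
        raise ValueError("not an http-scheme URI: " + uri)
    return {
        "host": parts.hostname,
        "port": parts.port or 80,        # Port 80 is assumed if none is given
        "abs_path": parts.path or "/",   # an empty path stands for "/"
        "query": parts.query or None,    # None when there is no query
    }
```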

Page 67: Hyper Text Transfer Protocol

Meaning of a http-scheme URI

• The meaning of a http-scheme URI is that the identified resource is on the server listening for TCP connections on that port of that host, and the Request-URI for the resource is abs_path.

• Thus, for example, pointing a browser at

http://student.cs.ucc.ie/cs1064/jabowen/

is the same as opening a TCP/IP connection to Port 80 on student.cs.ucc.ie and sending either

– the HTTP/1.0 request

GET /cs1064/jabowen/ HTTP/1.0

– or the HTTP/1.1 request

GET /cs1064/jabowen/ HTTP/1.1

Host: student.cs.ucc.ie

• (As we shall see later, all HTTP/1.1 requests must include a Host: header field)
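The translation from a URL to the text of a request can be written down directly. The following is a sketch; the function name `request_for` is made up for illustration, and only GET requests with no headers other than Host: are produced.

```python
from urllib.parse import urlsplit


def request_for(url, version="HTTP/1.1"):
    """Return the request text a client could send for an http URL."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [f"GET {target} {version}"]
    if version == "HTTP/1.1":
        # HTTP/1.1 requests must carry a Host: header; the port is
        # mentioned only when it is not the default, Port 80.
        host = parts.hostname
        if parts.port not in (None, 80):
            host += f":{parts.port}"
        lines.append(f"Host: {host}")
    return "\r\n".join(lines) + "\r\n\r\n"
```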

Page 68: Hyper Text Transfer Protocol

Meaning of a http-scheme URI (contd.)

• The lectures on HTML forms given earlier in this course used a method called POST to send user-supplied data to a server

• The POST method was not defined in HTTP/0.9, which only provided one method, the GET method

• The convention used in HTTP/0.9 to send data to a server was to encode the data in the Request-URI, in the form of a query at the end

• The convention is still supported in HTTP/1.1. Consider the following form:

<form action="http://student.cs.ucc.ie/myProg.cgi" method="get">

Home town: <input type="text" name="hometown">

<button type="submit"> Send data </button>

</form>

• If the user entered “cork” in the input box and submitted the form, the browser’s request would include this request line:

GET http://student.cs.ucc.ie/myProg.cgi?hometown=cork HTTP/1.1

Page 69: Hyper Text Transfer Protocol

Meaning of a http-scheme URI (contd.)

• The query in a URL can include several “equations”. Consider the following form:

<form action="http://student.cs.ucc.ie/myProg.cgi" method="get">

Surname: <input type="text" name="surname">

Home town: <input type="text" name="hometown">

<button type="submit"> Send data </button>

</form>

• If the user entered “sullivan” and “cork” in the input boxes and submitted the form, the browser’s request would include this request line:

GET http://student.cs.ucc.ie/myProg.cgi?surname=sullivan&hometown=cork HTTP/1.1

• “Equations” in a query are separated by the & character
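What the browser does with the form fields, and what a program on the server does in reverse, can be sketched with Python's standard `urllib.parse` module. The function names `form_get_url` and `fields_from_url` are illustrative only.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def form_get_url(action, fields):
    """Build the URL a GET form submission would request."""
    # urlencode writes each field as name=value and joins them with "&".
    return action + "?" + urlencode(fields)


def fields_from_url(url):
    """Recover the submitted fields from the query part of a URL."""
    parsed = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per name; this sketch keeps the first value.
    return {name: values[0] for name, values in parsed.items()}
```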

Page 70: Hyper Text Transfer Protocol

Meaning of a http-scheme URI (contd.)

• Some characters in user-supplied data have to be specially handled when a browser is writing the query in a Request-URI

• The following characters, called the “reserved” characters, have a special usage in URIs:

: / @ ? & = ;

• They have to be “URL encoded” before they can be sent as data in a URI query

• Consider the following form:

<form action="http://abc.com/prog.cgi" method="get">

Name of company: <input type="text" name="company">

Home town: <input type="text" name="place">

<button type="submit"> Send data </button>

</form>

• If the user entered “Black&Decker” and “Cork” in the input boxes, the browser’s request would include this request line

GET http://abc.com/prog.cgi?company=Black%26Decker&place=Cork HTTP/1.1

where the URL encoded form of & is %26, the 26 being the hexadecimal ASCII code for &
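A short sketch shows why the encoding matters. It uses Python's standard `urllib.parse` functions; the variable names are illustrative.

```python
from urllib.parse import parse_qs, urlencode

# urlencode escapes the "&" inside the value, so it cannot be mistaken
# for the separator between two "equations".
encoded = urlencode({"company": "Black&Decker", "place": "Cork"})

# The receiver decodes %26 back into "&" after splitting the query.
decoded = parse_qs(encoded)

# Without encoding, the value is cut short at the unescaped "&".
broken = parse_qs("company=Black&Decker&place=Cork")

# 26 is the hexadecimal ASCII code of "&".
ampersand_code = format(ord("&"), "X")
```

Here `encoded` is the query shown in the request line above, while `broken["company"]` contains only `Black`.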

Page 71: Hyper Text Transfer Protocol

Meaning of a http-scheme URI (contd.)

• URL escape codes for the “reserved” characters:

colon            %3A        slash            %2F
at (“@”)         %40        question-mark    %3F
equals           %3D        ampersand        %26
semi-colon (;)   %3B

Page 73: Hyper Text Transfer Protocol

Meaning of a http-scheme URI (contd.)

• The following characters, called the “unsafe” characters, should also be URL-encoded in URIs, using the hex codes specified:

space                %20        quotation mark      %22
less than            %3C        greater than        %3E
hash (“#”)           %23        percent             %25
left brace           %7B        right brace         %7D
pipe (“|”)           %7C        backslash           %5C
caret (“^”)          %5E        tilde               %7E
left sq bracket      %5B        right sq bracket    %5D
grave accent (“`”)   %60

• These characters are unsafe for different reasons
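Each escape code is simply a percent sign followed by the character's ASCII code in hexadecimal, so the codes can be generated rather than memorized. The sketch below uses a hand-written helper, `percent_encode` (a hypothetical name), because recent versions of Python's `urllib.parse.quote` deliberately leave the tilde unencoded.

```python
def percent_encode(char):
    """Return the %XX escape code for an ASCII character."""
    return "".join(f"%{byte:02X}" for byte in char.encode("ascii"))


RESERVED = ":/@?&=;"
UNSAFE = ' "<>#%{}|\\^~[]`'

# Maps each character to its escape code, e.g. " " to "%20".
ESCAPE_CODES = {char: percent_encode(char) for char in RESERVED + UNSAFE}
```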

Page 74: Hyper Text Transfer Protocol

Length of URI

• Since a browser which is sending user-supplied data to a server includes these data in the query part of a URL, URLs can get quite long

• The HTTP protocol does not place any a priori limit on the length of a URI:

– Servers must be able to handle the URI of any resource they serve up

– Servers should be able to handle URIs of unbounded length if they serve up GET-based forms that could generate such URIs.

– A server should return 414 (Request-URI Too Long) status if a URI is longer than the server can handle.

Page 75: Hyper Text Transfer Protocol

Host names in http-scheme URIs

• A fully-qualified host name of a host means

– either the fully-qualified domain name (i.e., a completely specified domain name ending in a top-level domain such as .com or .ie),

– or the numeric Internet Protocol (IP) address of the host.

• The fully qualified domain name is preferred; use of numeric IP addresses in URIs is strongly discouraged and should be avoided whenever possible.

Page 76: Hyper Text Transfer Protocol

Proxy handling of host names

• If a proxy receives a fully qualified domain name, the proxy must NOT change the host name.

• But, if a proxy receives a host name which is not a fully qualified domain name, it may add its domain to the host name it received.

Page 77: Hyper Text Transfer Protocol

Example

• Suppose we use the host name “cosmos” in a URL sent to the proxy “student.cs.ucc.ie”

• Then the proxy can extend this to “cosmos.cs.ucc.ie”

• E.g.

http://cosmos/jabowen/prog1.php

becomes

http://cosmos.cs.ucc.ie/jabowen/prog1.php
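A proxy could apply this rule along the following lines. This is a sketch, not how any particular proxy is implemented: the function name `qualify_host` is hypothetical, and treating any host name that contains a dot as already fully qualified is a simplifying assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def qualify_host(url, proxy_domain):
    """Append the proxy's domain to a host name that is not fully qualified."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if "." in host:
        # Assumed to be fully qualified already: must NOT be changed.
        return url
    netloc = f"{host}.{proxy_domain}"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))
```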

Page 78: Hyper Text Transfer Protocol

http://www.independent.co.uk

http://www.independent.co.uk/printer.php?storyID=14356