Internet Engineering Course Web Servers. Introduction Company needs to provide various web services...

Internet Engineering Course Web Servers


Page 1: Internet Engineering Course, Web Servers

Page 2: Introduction

A company needs to provide various web services
◦ Hosting intranet applications
◦ Company web site
◦ Various internet applications
Therefore it needs to run an HTTP server
◦ First we look at what the HTTP protocol is
◦ Then we talk about Web servers and Apache, the leading Web server application

Page 3: The World Wide Web (WWW)

Global hypertext system
Initially developed in 1989
◦ By Tim Berners-Lee at the European Laboratory for Particle Physics (CERN) in Switzerland.
◦ To give geographically dispersed groups of scientists an easy way to share and edit research documents.
In 1993, it started to grow rapidly
◦ Mainly due to the NCSA developing a Web browser called Mosaic (an X Window-based application)
  First widely used graphical interface to the Web
  More convenient browsing
◦ A flexible way for people to navigate worldwide resources on the Internet and retrieve them

Page 4: Web Browsers

Provide access to a Web server
Basic components
◦ HTML interpreter
◦ HTTP client used to retrieve HTML pages
Some also support
◦ FTP, NNTP, POP, SMTP, …

Page 5: Web Servers

Definitions
◦ A computer responsible for accepting HTTP requests from clients and serving them Web pages.
◦ A computer program that provides the above-mentioned functionality.
Common features
◦ Accepting HTTP requests from the network
◦ Providing an HTTP response to the requester
  Typically consists of an HTML document
◦ Usually capable of logging client requests and server responses

Page 6: Web Servers (cont.)

Returned content
◦ Static: comes from an existing file
◦ Dynamic: generated on demand by some other program/script called by the Web server
Path translation
◦ Translates the path component of a URL into a local file system resource
  The path specified by the client is relative to the server's root directory
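Path translation can be sketched as a small function. This is a minimal illustration, not how any particular server implements it; the document root `/var/www/html` and the function name `translate_path` are assumptions made for the example.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit

# Assumed document root; a real server reads this from its configuration.
DOCUMENT_ROOT = PurePosixPath("/var/www/html")


def translate_path(url, root=DOCUMENT_ROOT):
    """Map the path component of a URL to a path under the server root."""
    path = unquote(urlsplit(url).path)  # drop scheme, host and query; decode %xx
    parts = []
    for part in path.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            # Never climb above the document root.
            if parts:
                parts.pop()
            continue
        parts.append(part)
    return root.joinpath(*parts)


print(translate_path("http://www.somehost.com/path/file.html"))
```

Resolving `..` before joining keeps every result inside the root directory, which is the property a real server has to guarantee.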

Page 7: Basic Client/Server Architecture in WWW

Figure: Overall organization of the Web.

• The basic operation is to fetch documents
– Client issues requests, browser displays document
– Server is responsible for retrieving the document from its local file system
• Client/server communication is based on the HTTP protocol

Page 8: Dynamic Content

Parts of documents may be specified via scripts/programs
Client-side (executed on the client machine, e.g., within the browser)
◦ Client-side script: script embedded in the HTML document
◦ Applet: precompiled program passed to the client
Server-side (executed on the server machine)
◦ Server-side script embedded in the document
◦ Servlet: precompiled program executed within the server's address space
◦ CGI scripts

Page 9: Common Gateway Interface (CGI)

Figure: The principle of using server-side CGI programs.

• Allows documents to be generated dynamically, "on the fly"
• Provides a standard way for a Web server to execute a program using user-provided data as input
• To the server, a CGI program appears as the program responsible for fetching the requested document
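A CGI program can be sketched in a few lines. The example below assumes the usual CGI convention that the server passes the query string in the `QUERY_STRING` environment variable and reads the program's standard output; the function name `run_cgi` and the `name` parameter are made up for illustration.

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs


def run_cgi(environ=os.environ, out=sys.stdout):
    """Generate a page on the fly from user-provided data."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    # Headers first, then a blank line, then the document itself.
    out.write("Content-Type: text/html\r\n")
    out.write("\r\n")
    # escape() keeps user input from being interpreted as HTML.
    out.write(f"<html><body><h1>Hello, {escape(name)}!</h1></body></html>\n")


if __name__ == "__main__":
    run_cgi()
```

From the server's point of view, the output of this program is simply the document that was requested.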

Page 10: Architectural Overview

Figure: Architectural details of a client and server in the Web.

• Document fetch (and possibly server-side script): 2b-3b
• Execute CGI script (separate process): 2c-3c-4c
• Execute servlet program (runs within the server): 2a-3a-4a

Page 11: HTTP Protocol

Defines the communication between a Web server and a client
Used to deliver virtually all files and other data (collectively called resources) on the World Wide Web
A browser is an HTTP client because it sends requests to an HTTP server (Web server)
The standard (and default) port for HTTP servers to listen on is 80, though they can use any port.

Page 12: Structure of HTTP Transactions

Request/response, text-based protocol
Format of an HTTP message:

<initial line, different for request vs. response>
Header1: value1
Header2: value2
Header3: value3

<optional message body goes here, like file contents or query data; it can be many lines long, or even binary data>
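The message layout above can be taken apart mechanically. The following is a rough sketch (the function name `parse_message` is an assumption, and it ignores details such as repeated or folded headers): everything before the first blank line is the initial line plus headers, everything after it is the body.

```python
def parse_message(raw: bytes):
    """Split a raw HTTP message into (initial line, headers, body)."""
    # The first empty line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    initial_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return initial_line, headers, body


example = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 20\r\n"
    b"\r\n"
    b"<h1>Hello World</h1>"
)
print(parse_message(example))
```

The body is kept as bytes because, as the slide notes, it may be binary data.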

Page 13: The Format of a Request

method sp URL sp version cr lf
header : value cr lf          (header lines)
header : value cr lf
cr lf
Entity Body

Page 14: Request Example

GET /index.html HTTP/1.1 [CRLF]
Accept: image/gif, image/jpeg [CRLF]
User-Agent: Mozilla/4.0 [CRLF]
Host: www.ui.ac.ir:80 [CRLF]
Connection: Keep-Alive [CRLF]
[CRLF]

Page 15: Request Example (annotated)

GET /index.html HTTP/1.1          (method, request URL, version)
Accept: image/gif, image/jpeg     (headers)
User-Agent: Mozilla/4.0
Host: www.ui.ac.ir:80
Connection: Keep-Alive
[blank line here]
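As a small sketch of the request format, the function below assembles a request from its parts: the request line, one line per header, and the empty line that ends the header section. The name `build_request` and its parameters are illustrative choices, not part of any library.

```python
def build_request(method, url, headers, version="HTTP/1.1", body=b""):
    """Assemble an HTTP request: request line, headers, blank line, body."""
    lines = [f"{method} {url} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CR LF; an empty line ends the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


request = build_request(
    "GET",
    "/index.html",
    {
        "Accept": "image/gif, image/jpeg",
        "User-Agent": "Mozilla/4.0",
        "Host": "www.ui.ac.ir:80",
        "Connection": "Keep-Alive",
    },
)
print(request.decode("ascii"))
```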

Page 16: The Format of a Response

version sp status code sp phrase cr lf    (status line)
header : value cr lf                      (header lines)
header : value cr lf
cr lf
Entity Body

Page 17: Response Example

HTTP/1.0 200 OK
Date: Fri, 31 Dec 1999 23:59:59 GMT
Content-Type: text/html
Content-Length: 1354

<html>
<body>
<h1>Hello World</h1>
(more file contents)
  .
  .
  .
</body>
</html>

Page 18: Response Example (annotated)

HTTP/1.0 200 OK                        (version, status code, reason phrase)
Date: Fri, 31 Dec 1999 23:59:59 GMT    (headers)
Content-Type: text/html
Content-Length: 1354

<html>                                 (message body)
<body>
<h1>Hello World</h1>
(more file contents)
  .
  .
  .
</body>
</html>

Page 19: Initial Line

A typical initial request line:
◦ GET /path/to/file/index.html HTTP/1.0
Initial response line:
◦ HTTP/1.0 200 OK
◦ HTTP/1.0 404 Not Found
Status code:
◦ 1xx indicates an informational message only
◦ 2xx indicates success of some kind
◦ 3xx redirects the client to another URL
◦ 4xx indicates an error on the client's part
◦ 5xx indicates an error on the server's part
Common status codes:
◦ 200 OK
◦ 404 Not Found
◦ 301 Moved Permanently
◦ 302 Moved Temporarily
◦ 303 See Other (HTTP 1.1 only)
◦ 500 Internal Server Error
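The status classes can be read directly off the first digit of the code. A minimal sketch, with the helper names `parse_status_line` and `status_class` chosen only for this example:

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_status_line(line):
    """Split 'HTTP/1.0 404 Not Found' into version, numeric code and phrase."""
    # The reason phrase may contain spaces, so split at most twice.
    version, code, phrase = line.split(" ", 2)
    return version, int(code), phrase


def status_class(code):
    """Classify a status code by its first digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")


version, code, phrase = parse_status_line("HTTP/1.0 404 Not Found")
print(code, phrase, "is a", status_class(code))
```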

Page 20: Header Lines

Typical request headers:
◦ From: email address of requester
◦ User-Agent: for example User-Agent: Mozilla/3.0Gold
Typical response headers:
◦ Server: for example Server: Apache/1.2b3-dev
◦ Last-Modified: for example Last-Modified: Sun, 19 Feb 2006 23:59:59 GMT

Page 21: Message Body

In a response, this is where the requested resource is returned to the client (the most common use of the message body), or perhaps explanatory text if there's an error.
In a request, this is where user-entered data or uploaded files are sent to the server.
If an HTTP message includes a body, there are usually header lines in the message that describe the body. In particular,
◦ The Content-Type: header gives the MIME type of the data in the body, such as text/html or image/gif.
◦ The Content-Length: header gives the number of bytes in the body.
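Because Content-Length counts bytes rather than characters, it should be computed after the body has been encoded. A short sketch of that point; the function name `build_response` and the UTF-8 default are assumptions for the example.

```python
def build_response(body_text, content_type="text/html", charset="utf-8"):
    """Build a 200 response whose headers describe the body."""
    body = body_text.encode(charset)  # encode first, then measure
    head = (
        "HTTP/1.0 200 OK\r\n"
        f"Content-Type: {content_type}; charset={charset}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


# "é" is one character but two bytes in UTF-8.
print(build_response("<h1>Café</h1>"))
```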

Page 22: MIME Media Types

Multipurpose Internet Mail Extensions
HTTP sends the media type of the file using the Content-Type: header
Some important media types are
◦ text/plain, text/html
◦ image/gif, image/jpeg
◦ audio/basic, audio/wav
◦ model/vrml
◦ video/mpeg, video/quicktime
◦ application/*, application-specific data that does not fall under any other MIME category, e.g. application/octet-stream
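Servers commonly pick the media type from the file name extension. The sketch below uses the `mimetypes` module from the Python standard library; the wrapper name `content_type_for` and the fallback to application/octet-stream are choices made for this example.

```python
import mimetypes


def content_type_for(filename, default="application/octet-stream"):
    """Guess the Content-Type value for a file from its extension."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # Unknown extensions fall back to generic binary data.
    return guessed or default


for name in ("index.html", "logo.gif", "photo.jpeg", "README"):
    print(name, "->", content_type_for(name))
```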

Page 23: Sample HTTP Exchange

To retrieve the file at the URL http://www.somehost.com/path/file.html

Request:

GET /path/file.html HTTP/1.0
From: [email protected]
User-Agent: HTTPTool/1.0
[blank line here]

Response:

HTTP/1.0 200 OK
Date: Fri, 31 Dec 1999 23:59:59 GMT
Content-Type: text/html
Content-Length: 1354

<html>
<body>
<h1>Happy New Millennium!</h1>
(more file contents)
  .
  .
  .
</body>
</html>

Page 24: HTTP Methods

GET: request a resource by URL
HEAD
◦ Is just like a GET request, except it asks the server to return the response headers only, and not the actual resource (i.e. no message body).
◦ This is useful to check characteristics of a resource without actually downloading it, thus saving bandwidth.
POST
◦ A POST request is used to send data to the server to be processed in some way, like by a CGI script.
◦ There's a block of data sent with the request, in the message body. There are usually extra headers to describe this message body, like Content-Type: and Content-Length:.
◦ The request URI is not a resource to retrieve; it's usually a program to handle the data you're sending.
◦ The HTTP response is normally program output, not a static file.
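A POST request for form data can be sketched as follows. The script path, host name and field names in the usage example are hypothetical; the form encoding (`application/x-www-form-urlencoded`) is the one browsers use by default for HTML forms.

```python
from urllib.parse import urlencode


def build_post(path, host, fields):
    """Build a POST request that carries form fields in the message body."""
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        # Extra headers that describe the message body.
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


# Hypothetical script and form fields, for illustration only.
print(build_post("/path/script.cgi", "www.somehost.com",
                 {"home": "Cosby", "favorite flavor": "flies"}).decode("ascii"))
```

The request URI names the program that will handle the data, and the data itself travels in the body rather than in the URL.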

Page 25: HTTP 1.1

It is a superset of HTTP 1.0. Improvements include:
◦ Faster response, by allowing multiple transactions to take place over a single persistent connection.
◦ Faster response and great bandwidth savings, by adding cache support.
◦ Faster response for dynamically generated pages, by supporting chunked encoding, which allows a response to be sent before its total length is known.
◦ Efficient use of IP addresses, by allowing multiple domains to be served from a single IP address.
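Chunked encoding sends the body as a series of pieces, each prefixed by its size in hexadecimal, and ends with a zero-size chunk. The sketch below shows the basic framing only (no trailers, and chunk extensions are skipped); the function names are assumptions for this example.

```python
def encode_chunked(pieces):
    """Frame each piece as '<hex size> CRLF <data> CRLF', then a 0 chunk."""
    out = b""
    for piece in pieces:
        if piece:  # a zero-length chunk would signal the end of the body
            out += f"{len(piece):x}".encode("ascii") + b"\r\n" + piece + b"\r\n"
    return out + b"0\r\n\r\n"


def decode_chunked(data):
    """Reassemble the body from chunked framing."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Ignore any chunk extension after ';' on the size line.
        size = int(data[pos:line_end].split(b";")[0], 16)
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the chunk data and its trailing CRLF


wire = encode_chunked([b"Hello, ", b"world!"])
print(wire)
print(decode_chunked(wire))
```

Because each chunk announces only its own size, the server can start sending a dynamically generated page before it knows the total length.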

Page 26: Manually Experimenting with HTTP

> telnet eng.ui.ac.ir 80
Trying 192.168.50.84...
Connected to eng.ui.ac.ir
Escape character is '^]'.

Page 27: Sending a Request

> GET /~ladani/index.htm HTTP/1.0
[blank line]

Page 28: The Response

HTTP/1.1 200 OK
Date: Fri, 29 Feb 2008 08:23:33 GMT
Server: Apache/2.0.52 (CentOS)
Last-Modified: Wed, 07 Nov 2007 12:27:44 GMT
ETag: "6ccb6-741c-43e55e05a5000"
Accept-Ranges: bytes
Content-Length: 29724
Connection: close
Content-Type: text/html; charset=WINDOWS-1256

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<meta name="GENERATOR" content="Microsoft FrontPage 5.0">
….

Page 29: 200 OK

Request: GET /~ladani/index.htm HTTP/1.0

Response: HTTP/1.1 200 OK

HTML code

Page 30: 404 Not Found

Request: GET /~ladani/no-such-page.htm HTTP/1.0

Response: HTTP/1.1 404 Not Found

HTML code

Page 31: 400 Bad Request

Request: GET /index.html HTTP/1.1

Response: HTTP/1.1 400 Bad Request

HTML code

Why is it a Bad Request?

The request uses HTTP/1.1 but has no Host header, which HTTP/1.1 requires.
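The Host rule can be expressed as a single check on the parsed request. This sketch tests only that rule and reports 200 for everything else; the function name `host_header_status` is made up for the example.

```python
def host_header_status(request_text):
    """Return (code, phrase): 400 if an HTTP/1.1 request lacks a Host header."""
    head = request_text.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    _method, _target, version = lines[0].split(" ")
    header_names = {
        line.split(":", 1)[0].strip().lower() for line in lines[1:] if line
    }
    if version == "HTTP/1.1" and "host" not in header_names:
        return 400, "Bad Request"
    return 200, "OK"


print(host_header_status("GET /index.html HTTP/1.1\r\n\r\n"))
print(host_header_status("GET /index.html HTTP/1.1\r\nHost: www.ui.ac.ir\r\n\r\n"))
```

The Host header is what lets one IP address serve several domains, so an HTTP/1.1 server cannot do without it.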

Page 32: Session-persistent State

What does session-persistent state mean?
◦ State information that is preserved between browsing sessions.
◦ Information that is stored semi-permanently (i.e., on disk) for later access.
Why was the calculator example not session-persistent?
◦ Sum, current display, etc. were not preserved if we went to a different website and back to the calculator.

Page 33: Why Session-persistence?

User-based customizations.
◦ MyYahoo, E*Trade, etc.
Long transactions.
◦ Electronic shopping carts.
◦ Order preparation
Server-side state maintenance.
◦ Large amounts of state info that you don't want to pass back and forth.

Page 34: Cookie Overview

HTTP cookies are a mechanism for creating and using session-persistent state.
Cookies are simple string values that are associated with a set of URLs.
Servers set cookies using an HTTP header.
The client transmits the cookie as part of the HTTP request whenever an associated URL is visited in the future.

Page 35: Anatomy of a Cookie

A cookie has 6 parts:
◦ Name
◦ Value
◦ Domain
◦ Path
◦ Expiration
◦ Security flag
Name and Value are required; the others have default values.

Page 36: Setting a Cookie

A cookie is set using the "Set-Cookie" header in an HTTP response.
The string value of the Set-Cookie header is parsed into semicolon-separated fields that define the different parts of the cookie.
The cookie is stored by the client.

Page 37: Sending Cookies

Every time a client makes an HTTP request, it tests every cookie for a match.
Cookies match if…
◦ The cookie domain is a suffix of the URL's server name.
◦ The cookie expiration has not passed.
◦ The cookie path is a prefix of the URL path.
◦ The connection is secure, if the cookie's security flag is on.
If a match is made, the name/value pair of the cookie is sent as a "Cookie" header in the request.
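The matching rules can be written as one predicate. In this sketch a cookie is a plain dictionary with the keys `domain`, `path`, `expires` (a timezone-aware `datetime` or `None`) and `secure`; that representation and the function name `cookie_matches` are assumptions made for the example, and real browsers apply further rules.

```python
from datetime import datetime, timezone


def cookie_matches(cookie, host, path, secure_connection, now=None):
    """Decide whether a stored cookie should be sent with a request."""
    now = now or datetime.now(timezone.utc)
    # Domain: the cookie domain must be a suffix of the server name.
    domain = (cookie.get("domain") or host).lstrip(".")
    domain_ok = host == domain or host.endswith("." + domain)
    # Path: the cookie path must be a prefix of the URL path.
    path_ok = path.startswith(cookie.get("path") or "/")
    # Expiration: a cookie without an expiry date never expires here.
    expires = cookie.get("expires")
    fresh = expires is None or now < expires
    # Security flag: only restricts the cookie when it is switched on.
    secure_ok = secure_connection or not cookie.get("secure", False)
    return domain_ok and path_ok and fresh and secure_ok


cookie = {
    "name": "my_cookie",
    "value": "This is my cookie value",
    "domain": ".eng.ui.ac.ir",
    "path": "/~ladani",
    "expires": datetime(2008, 3, 6, 12, 0, 0, tzinfo=timezone.utc),
    "secure": False,
}
when = datetime(2008, 3, 1, tzinfo=timezone.utc)
print(cookie_matches(cookie, "eng.ui.ac.ir", "/~ladani/index.htm", False, now=when))
```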

Page 38: Setting a Cookie

Full cookie:
Set-Cookie: my_cookie=This is my cookie value; domain=.eng.ui.ac.ir; path=/~ladani; expires=Thu, 06-Mar-08 12:00:00 GMT

A response can have more than one Set-Cookie header, or can combine more than one cookie in one header by separating them with a comma (,)
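Splitting a Set-Cookie value into its parts is mostly a matter of splitting on semicolons. A rough sketch, handling a single cookie per header and only the six parts named in these slides; the function name `parse_set_cookie` and the dictionary layout are assumptions for the example.

```python
def parse_set_cookie(header_value):
    """Parse the value of one Set-Cookie header into the cookie's parts."""
    fields = [field.strip() for field in header_value.split(";")]
    # The first field is always name=value.
    name, _, value = fields[0].partition("=")
    cookie = {
        "name": name.strip(),
        "value": value.strip(),
        "domain": None,
        "path": None,
        "expires": None,
        "secure": False,
    }
    for field in fields[1:]:
        key, _, val = field.partition("=")
        key = key.strip().lower()
        if key == "secure":
            cookie["secure"] = True  # a flag with no value
        elif key in ("domain", "path", "expires"):
            cookie[key] = val.strip()
    return cookie


print(parse_set_cookie(
    "my_cookie=This is my cookie value; domain=.eng.ui.ac.ir; "
    "path=/~ladani; expires=Thu, 06-Mar-08 12:00:00 GMT"
))
```

Parts that are missing stay at `None` or `False` here; the client fills in its own defaults.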

Page 39: Cookie Matching

Biggest misunderstanding:
◦ Servers do not RETRIEVE cookies!
◦ Servers RECEIVE cookies previously planted.
Step 1:
◦ Some response by the server installs a cookie with the "Set-Cookie" header.
◦ The client saves the cookie to disk.

Page 40: Cookie Matching (cont.)

Step 2:
◦ The browser goes to some page which matches a previously received cookie.
◦ The cookie name and value are sent in the request as a "Cookie" HTTP header.
Step 3:
◦ The CGI program detects the presence of the cookie and uses it.
  Where is the cookie info?
  Environment variable HTTP_COOKIE
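Reading the cookies inside a CGI program then comes down to parsing one environment variable. A minimal sketch, assuming the server has copied the request's Cookie header into `HTTP_COOKIE` as `name=value` pairs separated by semicolons; the function name is hypothetical.

```python
import os


def cookies_from_environ(environ=os.environ):
    """Return the cookies the client sent, as a name -> value dictionary."""
    cookies = {}
    for pair in environ.get("HTTP_COOKIE", "").split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:  # skip empty or malformed pieces
            cookies[name.strip()] = value.strip()
    return cookies


print(cookies_from_environ({"HTTP_COOKIE": "my_cookie=abc; session=42"}))
```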

Page 41: Where Are Cookies Stored on the Client?

Client-specific locations.
No standard.
Latest IE stores them in a folder called "Temporary Internet Files"
◦ Each cookie is stored in a separate file.
Netscape stores them in "cookies.txt"

Page 42: Typical Cookie Usages

Cookies as database index
◦ Most common use of cookies.
◦ State information is kept in some sort of database and the cookie acts as an index.
Cookies as state variables
◦ The name of the cookie is like a variable name.
◦ The value of the cookie is the state information.

Page 43: Cookie Security

The security flag restricts when the browser will send a cookie back to the server.
◦ Requires a "secure" connection.
  For example: https in effect.
What does this mean about when the cookie was set?

Page 44: First Web Server

Berners-Lee wrote two programs
◦ A browser called WorldWideWeb
◦ The world's first Web server, which ran on NeXTSTEP
  The machine is on exhibition at CERN's public museum

Page 45: Most Famous Web Servers

Apache HTTP Server from the Apache Software Foundation
Internet Information Services (IIS) from Microsoft
Google Web Server (GWS)
◦ Started from May 2007
Lighttpd
◦ Powers several popular Web 2.0 sites like YouTube, Wikipedia and meebo

Page 46: Web Servers Usage: Statistics

The most popular Web servers used for public Web sites are tracked by the Netcraft Web Server Survey
◦ Details are given by the Netcraft Web Server Reports
Apache has been the most popular since April 1996
Currently (February 2008) about
◦ 50.93% Apache
◦ 35.56% Microsoft (IIS, PWS, etc.)
◦ 5.16% Google
◦ 0.99% Lighttpd

Page 47: Web Servers Usage: Statistics (cont.)

Figure: Total Sites Across All Domains, August 1995 - February 2008

Page 48: Web Servers Usage: Statistics (cont.)

Figure: Market Share for Top Servers Across All Domains, August 1995 - February 2008

Page 49: Web Servers Usage: Statistics (cont.)

Figure: Totals for Active Servers Across All Domains, June 2000 - February 2008

Page 50: Apache (A PAtCHy) Web Server

Origins: NCSA (Univ. of Illinois, Urbana/Champaign)
Now: Apache Software Foundation (www.apache.org), developers world-wide
Most widely used Web server today [Netcraft Web survey, 2/2008]
Open source software
◦ Geographically distributed developers
◦ This called for a modular, extensible design in which third-party developers can override or extend basic characteristics

Page 51: Web Server Processing Steps

Accept Client Connection
Read HTTP Request Header
Find File
Send HTTP Response Header
Read File, Send Data
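The processing steps can be sketched with plain sockets. This is a deliberately small illustration, not a usable server: it handles one GET request per connection, sends only a Content-Length header, and the names `handle_connection` and `serve` as well as port 8080 are assumptions for the example.

```python
import socket
from pathlib import Path


def handle_connection(conn, root):
    """Serve one request on an already accepted connection."""
    # Read HTTP request header: everything up to the blank line.
    request = b""
    while b"\r\n\r\n" not in request:
        chunk = conn.recv(4096)
        if not chunk:
            break
        request += chunk
    parts = request.split(b"\r\n", 1)[0].decode("iso-8859-1").split(" ")
    target = parts[1] if len(parts) >= 2 else "/"

    # Find file: map the URL path onto the document root.
    path = Path(root, target.split("?", 1)[0].lstrip("/"))
    if ".." in path.parts or not path.is_file():
        conn.sendall(b"HTTP/1.0 404 Not Found\r\nContent-Length: 0\r\n\r\n")
        return

    # Send HTTP response header.
    size = path.stat().st_size
    conn.sendall(f"HTTP/1.0 200 OK\r\nContent-Length: {size}\r\n\r\n".encode("ascii"))

    # Read file, send data.
    with path.open("rb") as f:
        while block := f.read(65536):
            conn.sendall(block)


def serve(root, host="127.0.0.1", port=8080):
    """Accept client connections one at a time, forever."""
    with socket.create_server((host, port)) as server:
        while True:
            conn, _addr = server.accept()  # Accept client connection
            with conn:
                handle_connection(conn, root)
```

Calling `serve("/some/directory")` would start the loop; a production server additionally handles many connections concurrently, which this sketch does not attempt.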

Page 52: Apache HTTP Server

Apache Core
◦ Receives the client request
◦ Typically allocates a new process for each incoming request
◦ Allocates a request record
◦ Invokes handlers on individual modules in sequence
Modules register handlers during configuration
Handler
◦ The request record is passed as a single parameter
◦ Each handler reads/modifies the request record

Page 53: Web Server Phases

Apache core invokes a handler for each phase
◦ Resolve the document reference (URI) to a local file name (or CGI program + parameters)
◦ Client authentication (verify client identity)
◦ Client access control (determine access rights)
◦ Request access control (check if access is allowed)
◦ MIME type determination of the response
◦ General phase for handling leftovers (e.g., check syntax of returned response, build up user profile)
◦ Transmission of the response to the client
◦ Logging data on the processing of the request
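The core/handler design can be modelled in a few lines. The sketch below is a Python analogy only: Apache itself is written in C, and the phase names, the `RequestRecord` fields and the three example handlers are invented for illustration rather than taken from Apache's API.

```python
from dataclasses import dataclass, field

# Illustrative labels for the phases listed above, not Apache identifiers.
PHASES = ["translate", "authenticate", "access_control", "mime_type",
          "fixup", "respond", "log"]


@dataclass
class RequestRecord:
    uri: str
    filename: str = ""
    content_type: str = ""
    status: int = 200
    log_lines: list = field(default_factory=list)


class Core:
    def __init__(self):
        self.handlers = {phase: [] for phase in PHASES}

    def register(self, phase, handler):
        """Called by a module during configuration."""
        self.handlers[phase].append(handler)

    def process(self, uri):
        record = RequestRecord(uri)
        for phase in PHASES:  # phases run in a fixed order
            for handler in self.handlers[phase]:
                handler(record)  # each handler reads/modifies the record
        return record


def translate(record):
    record.filename = "/var/www/html" + record.uri  # assumed document root


def set_mime_type(record):
    is_html = record.uri.endswith(".html")
    record.content_type = "text/html" if is_html else "application/octet-stream"


def write_log(record):
    record.log_lines.append(f"{record.uri} -> {record.status}")


core = Core()
core.register("translate", translate)
core.register("mime_type", set_mime_type)
core.register("log", write_log)
print(core.process("/index.html"))
```

Passing a single request record through every handler is what lets third-party modules override or extend one phase without touching the others.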

Page 54: References

http://www.jmarshall.com/easy/http/
TCP/IP Tutorial and Technical Overview, Rodriguez, Gatrell, Karas, Peschke, IBM Redbooks, August 2001
Wikipedia, the free encyclopedia
Apache: The Definitive Guide, 2nd edition, Ben Laurie, Peter Laurie, O'Reilly, February 1999
Webmaster in a Nutshell, 1st edition, Stephen Spainhour, Valerie Quercia, O'Reilly, October 1996
Netcraft: February 2006 Web Server Survey