Lecture 13

Dynamic Web Servers & Common Gateway Interface

CPE 401 / 601 Computer Network Systems

slides are modified from Dave Hollinger

Web Server

Talks HTTP

Looks at METHOD, URI to determine what the client wants.

For GET, URI often is just the path of a file relative to some directory on the web server

[Figure: directory tree on the web server. The root / contains usr, bin, www and etc; below that are foo, fun and gif; below that is blah. The request GET /foo/blah selects the file blah.]
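A minimal C sketch of the mapping from a GET URI to a file path. The helper name uri_to_path, the document root passed by the caller, and the choice to reject any URI containing ".." are assumptions made for illustration, not something the slides specify.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: join a document root and a request URI into a
 * file system path. Returns 0 on success, -1 if the URI is rejected
 * or the result does not fit in `out`. */
int uri_to_path(const char *docroot, const char *uri, char *out, size_t cap)
{
    /* The URI must be an absolute path such as "/foo/blah". */
    if (uri == NULL || uri[0] != '/')
        return -1;

    /* Assumption: refuse ".." so a request cannot climb out of docroot. */
    if (strstr(uri, "..") != NULL)
        return -1;

    int n = snprintf(out, cap, "%s%s", docroot, uri);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return 0;
}
```

With a document root of "/www", the URI "/foo/blah" would map to "/www/foo/blah".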

In the good old days...

Years ago the WWW was made up of (mostly) static documents.

Each URL corresponded to a single file stored on some hard disk.

Today many of the documents on the WWW are built at request time.

A URL doesn’t correspond to a single file.

Dynamic Documents

Dynamic Documents can provide:

• automation of web site maintenance

• customized advertising

• database access

• shopping carts

• date and time service

Web Programming

Writing programs that create dynamic documents has become very important.

There are a number of general approaches:

Create a custom server for each service desired.

• Each is available on a different port.

Have the web server run external programs.

Develop a really smart web server.

• SSI, scripting, server APIs.

Custom Server

Write a TCP server that watches a “well known” port for requests.

Develop a mapping from HTTP requests to service requests.

Send back HTML (or whatever) that is created/selected by the server process.

The server has to handle HTTP errors, headers, etc. itself.

An Example Custom Server

We want to provide a time and date service.

Anyone in the world can find out the date and time according to our computer!!!

We don’t care what is in the HTTP request; our reply doesn’t depend on it.

We assume the request comes from a browser that wants the content formatted as an HTML document.

WWW based time and date server

Listen on a well known TCP port.

Loop forever:

• Accept a connection.

• Find out the current time and date.

• Convert time and date to a string.

• Send back some HTTP headers (Content-Type).

• Send the string wrapped in HTML formatting.

• Close the connection.
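The steps of the time and date server can be sketched in C roughly as follows, assuming POSIX sockets. The function names build_time_reply and serve_time_forever, the timestamp format and the HTML layout are illustrative assumptions; error handling is kept to a minimum.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Build the complete HTTP reply for time `now` into buf.
 * Returns the reply length, or 0 if it does not fit. */
size_t build_time_reply(char *buf, size_t cap, time_t now)
{
    char stamp[64];
    struct tm tm_now;

    if (localtime_r(&now, &tm_now) == NULL)
        return 0;
    if (strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", &tm_now) == 0)
        return 0;

    int n = snprintf(buf, cap,
                     "HTTP/1.0 200 OK\r\n"
                     "Content-Type: text/html\r\n"
                     "\r\n"
                     "<HTML><BODY><H1>%s</H1></BODY></HTML>\n", stamp);
    if (n < 0 || (size_t)n >= cap)
        return 0;
    return (size_t)n;
}

/* Listen on `port` and answer every connection with the current time.
 * Returns -1 if setup fails; otherwise it loops forever. */
int serve_time_forever(unsigned short port)
{
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(listener, 16) < 0) {
        close(listener);
        return -1;
    }

    for (;;) {
        int conn = accept(listener, NULL, NULL);
        if (conn < 0)
            continue;

        /* Read the request once and ignore it: the reply never depends on it. */
        char request[1024];
        ssize_t ignored = read(conn, request, sizeof request);
        (void)ignored;

        char reply[512];
        size_t len = build_time_reply(reply, sizeof reply, time(NULL));
        if (len > 0 && write(conn, reply, len) < 0) {
            /* The client went away; nothing more to do for this connection. */
        }
        close(conn);
    }
}
```

A main() would only need to call serve_time_forever() with the chosen port. Keeping the reply formatting in its own function makes it possible to check the output without opening a socket.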

Another Example: Counter

Keep track of how many times our server is hit each day.

Report on the number of hits our server got on any day in the past!

The reply now does depend on the request.

We have to remember that the request comes from an HTTP client, so we need to accept HTTP requests.

Time & Date Hit Server

Each request comes as a string (URI) specifying a resource.

Our requests will look like this:

/mm/dd/yyyy

An example URL for our service:

http://www.timedate.com:4567/02/10/2000

We will get a request like:

GET /02/10/2000 HTTP/1.1

New code

• Record the “hit” in database.

• Read request - parse request to month,day,year

• Lookup hits for month,day,year in database.

• Send back some HTTP headers (Content-Type)

• Create HTML table and send back to client.

• Close the connection.
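Parsing the request into month, day and year might look like the following C sketch. The helper name parse_date_request and the range checks are assumptions for illustration; the request format is the /mm/dd/yyyy form used by the hit server.

```c
#include <stdio.h>

/* Hypothetical helper: pull month, day and year out of a request line
 * such as "GET /02/10/2000 HTTP/1.1".
 * Returns 0 on success, -1 if the line does not have that shape. */
int parse_date_request(const char *line, int *month, int *day, int *year)
{
    int m, d, y;

    /* The literal parts of the format must match the request line. */
    if (sscanf(line, "GET /%2d/%2d/%4d", &m, &d, &y) != 3)
        return -1;

    /* Coarse sanity checks only; a real service would validate the date. */
    if (m < 1 || m > 12 || d < 1 || d > 31 || y < 1)
        return -1;

    *month = m;
    *day = d;
    *year = y;
    return 0;
}
```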

Drawbacks to Custom Server Approach

We might have lots of ideas for custom services.

Each requires a dedicated address (port).

Each needs to include:

• basic TCP server code

• parsing HTTP requests

• error handling

• headers

• access control


Another Approach

Take a general purpose Web server (that can handle static documents) and have it process requested documents as it sends them to the client.

The documents could contain commands that the server understands (the server includes some kind of interpreter).

Example Smart Server

Have the server read each HTML file as it sends it to the client.

The server could look for this:

<SERVERCODE> some command </SERVERCODE>

The server doesn’t send this part to the client; instead it interprets the command and sends the result to the client.

Everything else is sent normally.

Example Document

<TITLE>timedate.com Home Page</TITLE>
<H1 ALIGN=CENTER>Welcome to timedate.com</H1>
<SERVERCODE> include fancygraphic </SERVERCODE>

The current time is <SERVERCODE> time </SERVERCODE>.<P>

Today is <SERVERCODE> date </SERVERCODE>.

Visit our sponsor: <SERVERCODE> random sponsor </SERVERCODE>
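How a server might interpret the made-up SERVERCODE syntax can be sketched in C as below. Everything here is hypothetical: the function names, the tiny command table (only "time" and "date", each returning fixed placeholder text) and the choice to pass an unterminated tag through unchanged.

```c
#include <string.h>

#define OPEN_TAG  "<SERVERCODE>"
#define CLOSE_TAG "</SERVERCODE>"

/* Hypothetical command table with fixed placeholder results. */
static const char *run_command(const char *cmd)
{
    if (strcmp(cmd, "time") == 0) return "12:00:00";
    if (strcmp(cmd, "date") == 0) return "01/01/2000";
    return "[unknown command]";
}

/* Append n bytes of src to out, keeping out NUL terminated. */
static int append(char *out, size_t cap, size_t *used, const char *src, size_t n)
{
    if (*used + n >= cap)
        return -1;
    memcpy(out + *used, src, n);
    *used += n;
    out[*used] = '\0';
    return 0;
}

/* Copy doc to out, replacing each <SERVERCODE> cmd </SERVERCODE> region
 * with the result of the command. Returns 0, or -1 if out is too small. */
int expand_servercode(const char *doc, char *out, size_t cap)
{
    size_t used = 0;
    const char *p = doc;

    if (cap == 0)
        return -1;
    out[0] = '\0';

    for (;;) {
        const char *start = strstr(p, OPEN_TAG);
        const char *body = start ? start + strlen(OPEN_TAG) : NULL;
        const char *stop = body ? strstr(body, CLOSE_TAG) : NULL;

        /* No complete tag left: everything else is sent normally. */
        if (stop == NULL)
            return append(out, cap, &used, p, strlen(p));

        /* Text before the tag is sent unchanged. */
        if (append(out, cap, &used, p, (size_t)(start - p)) < 0)
            return -1;

        /* Trim the spaces around the command name. */
        while (body < stop && *body == ' ')
            body++;
        const char *end = stop;
        while (end > body && end[-1] == ' ')
            end--;

        char cmd[64];
        size_t len = (size_t)(end - body);
        if (len >= sizeof cmd)
            len = sizeof cmd - 1;
        memcpy(cmd, body, len);
        cmd[len] = '\0';

        const char *result = run_command(cmd);
        if (append(out, cap, &used, result, strlen(result)) < 0)
            return -1;

        p = stop + strlen(CLOSE_TAG);
    }
}
```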

Real Life - Server Side Includes

Many real web servers support this idea but not the syntax we’ve shown.

Server Side Includes (SSI) provides a set of commands that a server will interpret.

Typically the server is configured to look for commands only in specially marked documents, so normal documents aren’t slowed down.

SSI Directives

SSI commands are called directives

Directives are embedded in HTML comments.

A comment looks like this:

<!-- this is an HTML comment -->

A directive looks like this:

<!--#command parameter="arg"-->

Some SSI Directives

SSI servers keep a number of useful things in environment variables:

DOCUMENT_NAME, DOCUMENT_URI

echo: inserts the value of an environment variable into the page.

This page is located at

<!--#echo var="DOCUMENT_URI"-->.

SSI Directives

include: inserts the contents of a text file.

<!--#include file="banner.html"-->

flastmod: inserts the time and date that a file was last modified.

Last modified:

<!--#flastmod file="foo.html"-->

SSI Directives (cont.)

exec: runs an external program and inserts the output of the program.

Current users:

<!--#exec cmd="/usr/bin/who"-->

Danger! Danger! Danger!
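Recognising a directive of the form command parameter="arg" inside an HTML comment could be sketched in C as below. The helper name parse_directive and the buffer sizes are assumptions for illustration; a real server's parser handles more cases (several parameters, different quoting, extra whitespace).

```c
#include <stdio.h>

/* Hypothetical helper: split one SSI directive into its command,
 * parameter name and argument. Returns 0 on success, -1 if the text
 * is not a complete directive ending in "-->". */
int parse_directive(const char *text, char cmd[32], char param[32], char arg[128])
{
    int end = -1;

    /* %n is only reached when the closing "-->" was matched as well. */
    int got = sscanf(text, "<!--#%31[^ ] %31[^=]=\"%127[^\"]\" -->%n",
                     cmd, param, arg, &end);

    return (got == 3 && end >= 0) ? 0 : -1;
}
```

An ordinary HTML comment, or a directive that is missing its closing "-->", is rejected.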

More Power

Some servers support elaborate scripting languages.

Scripts are embedded in HTML documents, and the server interprets the script:

Microsoft Active Server Pages (ASP)

• JScript, VBScript, PerlScript

Netscape LiveWire

• JavaScript, SQL connection library.

There are others...

Server Mapping and APIs

Some servers include a programming interface that allows us to extend the capabilities of the server by writing modules.

Specific URLs are mapped to specific modules instead of to files.

We could write our timedate.com server as a module and merge it with the web server.


External Programs

Another approach is to provide a standard interface between external programs and web servers.

• We can run the same program from any web server.

The web server handles all the HTTP;

• we focus on the special service only.

It doesn’t matter what language we use to write the external program.

Common Gateway Interface

CGI is a standard interface to external programs supported by most (if not all) web servers.

The interface that is defined by CGI includes:

Identification of the service

• external program

Mechanism for passing the request to the external program.

CGI Programming

We will focus on CGI programming.

CGI programs are often written in scripting languages (perl, tcl, etc.); we will concentrate on C.

[Figure: CGI Programming. The CLIENT sends an http request to the HTTP SERVER and gets back an http response. The HTTP SERVER starts the CGI Program using setenv(), dup(), fork(), exec(), ...]

Common Gateway Interface

CGI is a standard mechanism for:

• Associating URLs with programs that can be run by a web server.

• Passing the request to the external program (a protocol, of sorts).

• Sending the external program's response back to the client.

CGI URLs

There is some mapping between URLs and CGI programs provided by a web server.

The exact mapping is not standardized

• web server admin can set it up

Typically, requests that start with /CGI-BIN/, /cgi-bin/ or /cgi/, etc. refer to CGI programs

• not to static documents.

Request CGI program

The web server sets some environment variables with information about the request.

The web server fork()s and the child process exec()s the CGI program.

The CGI program gets information about the request from environment variables.


STDIN, STDOUT

Before calling exec(), the child process sets up pipes so that stdin comes from the web server and stdout goes to the web server.

In some cases part of the request is read from stdin.

Anything written to stdout is forwarded by the web server to the client.

[Figure: the HTTP SERVER passes Environment Variables to the CGI Program, feeds the program's stdin and reads the program's stdout.]
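The server side of this handoff can be sketched in C roughly as follows, for a GET request only. The function name run_cgi_get is an assumption, stdin is left untouched because a GET carries no body, and the environment is set in the child after fork() so the server's own environment stays clean (the slides describe the server setting the variables before forking; either order works for the child).

```c
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Run the program at `path` as a CGI for a GET request and collect what
 * it writes to stdout into `out` (NUL terminated).
 * Returns the number of bytes collected, or -1 on error. */
ssize_t run_cgi_get(const char *path, const char *query, char *out, size_t cap)
{
    int fds[2];

    if (cap == 0 || pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: describe the request in environment variables. */
        setenv("REQUEST_METHOD", "GET", 1);
        setenv("QUERY_STRING", query, 1);

        /* Make stdout the write end of the pipe back to the server. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);

        execl(path, path, (char *)NULL);
        _exit(127);              /* only reached if exec failed */
    }

    /* Parent (the web server): read everything the CGI prints. */
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < cap - 1 &&
           (n = read(fds[0], out + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';

    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)used;
}
```

A real server would forward the collected bytes to the client instead of only storing them.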

Important CGI Environment Variables

REQUEST_METHOD

QUERY_STRING

CONTENT_LENGTH


Request Method: Get

GET requests can include a query string as part of the URL:

GET /cgi-bin/login?mgunes HTTP/1.0

In this request line, GET is the request method, /cgi-bin/login is the resource name, ‘?’ is the delimiter and mgunes is the query string.

/cgi-bin/login?mgunes

The web server treats everything before the ‘?’ delimiter as the resource name.

In this case the resource name is the name of a program.

Everything after the ‘?’ is a string that is passed to the CGI program.

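Splitting a URI at the ‘?’ delimiter can be sketched in C as below. The helper name split_uri and the fixed-size output buffers are assumptions for illustration.

```c
#include <string.h>

/* Hypothetical helper: copy the part of uri before the first '?' into
 * resource and the part after it into query (empty if there is no '?').
 * Returns 0 on success, -1 if either part does not fit. */
int split_uri(const char *uri, char *resource, size_t rcap,
              char *query, size_t qcap)
{
    const char *mark = strchr(uri, '?');
    size_t rlen = mark ? (size_t)(mark - uri) : strlen(uri);
    const char *rest = mark ? mark + 1 : "";

    if (rlen >= rcap || strlen(rest) >= qcap)
        return -1;

    memcpy(resource, uri, rlen);
    resource[rlen] = '\0';
    strcpy(query, rest);         /* length was checked above */
    return 0;
}
```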

Simple GET queries - ISINDEX

You can put an <ISINDEX> tag inside an HTML document.

The browser will create a text box that allows the user to enter a single string.

If an ACTION is specified in the ISINDEX tag, when the user presses Enter, a request will be sent to the server specified as the ACTION.

ISINDEX Example

Enter a string:

<ISINDEX ACTION=http://foo.com/search.cgi>

Press Enter to submit your query.

If you enter the string “blahblah”, the browser will send a request to the HTTP server at foo.com that looks like this:

GET /search.cgi?blahblah HTTP/1.1


What the CGI sees

The CGI Program gets REQUEST_METHOD using getenv:

char *method;
method = getenv("REQUEST_METHOD");
if (method==NULL) … /* error! */

Getting the GET

If the request method is GET:

if (strcasecmp(method,"get")==0)

The next step is to get the query string from the environment variable QUERY_STRING

char *query;
query = getenv("QUERY_STRING");

Send back http Response and Headers:

The CGI program can send back an HTTP status line:

printf("HTTP/1.1 200 OK\r\n");

and headers:

printf("Content-type: text/html\r\n");
printf("\r\n");

Important!

A CGI program doesn’t have to send a status line; the HTTP server will do this for you if you don’t.

A CGI program must always send back at least one header line indicating the data type of the content (usually text/html).

The web server will typically throw in a few header lines of its own: Date, Server, Connection.

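Simple GET handler

#include <stdio.h>
#include <stdlib.h>

int main() {
  char *method, *query;

  method = getenv("REQUEST_METHOD");
  if (method==NULL) … /* error! */

  query = getenv("QUERY_STRING");
  if (query==NULL) query = "";   /* no query string was passed */

  /* header line, then the blank line that ends the headers */
  printf("Content-type: text/html\r\n\r\n");
  printf("<H1>Your query was %s</H1>\n", query);
  return(0);
}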

URL-encoding

Browsers use an encoding when sending query strings that include special characters.

Most nonalphanumeric characters are encoded as a ‘%’ followed by 2 ASCII encoded hex digits.

• ‘=’ (which is hex 3D) becomes “%3D”

• ‘&’ becomes “%26”

The space character ‘ ’ is replaced by ‘+’.

• Why? (think about project 2 parsing…)

The ‘+’ character is replaced by “%2B”

• “foo=6 + 7” becomes “foo%3D6+%2B+7”
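A CGI has to reverse this encoding before using the query. A minimal C sketch of a decoder follows; the function names are assumptions, and a ‘%’ that is not followed by two hex digits is simply copied through unchanged.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Decode a URL-encoded string in place: '+' becomes a space and
 * "%XX" becomes the byte with hex value XX.
 * Returns the length of the decoded string. */
size_t url_decode(char *s)
{
    char *dst = s;
    const char *src = s;

    while (*src != '\0') {
        int hi, lo;
        if (*src == '+') {
            *dst++ = ' ';
            src++;
        } else if (*src == '%' &&
                   (hi = hex_value((unsigned char)src[1])) >= 0 &&
                   (lo = hex_value((unsigned char)src[2])) >= 0) {
            *dst++ = (char)(hi * 16 + lo);
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return (size_t)(dst - s);
}
```

Decoding in place is safe because the decoded text is never longer than the encoded text. When a query holds several fields, it should be split at ‘&’ and ‘=’ before decoding, since a decoded value may itself contain those characters.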

Security!!!

It is a very bad idea to build a command line containing user input!

What if the user submits: “ ; rm -r *;”

The command line that gets run becomes:

grep ; rm -r *; /usr/dict/words

Beyond ISINDEX - Forms

Many Web services require more than a simple ISINDEX.

HTML includes support for forms:

• lots of field types

• user answers all kinds of annoying questions

• entire contents of form must be stuck together and put in QUERY_STRING by the Web server.

Form Fields

Each field within a form has a name and a value.

The browser creates a query that includes a sequence of “name=value” substrings and sticks them together separated by the ‘&’ character.

If user types in “Mehmet H.” as the name and “none” for occupation, the query would look like this:

“name=Mehmet+H%2E&occupation=none”

HTML Forms

Each form includes a METHOD that determines what http method is used to submit the request.

Each form includes an ACTION that determines where the request is made.


An HTML Form

<FORM METHOD=GET ACTION=http://foo.com/signup.cgi>
Name: <INPUT TYPE=TEXT NAME=name><BR>
Occupation: <INPUT TYPE=TEXT NAME=occupation><BR>
<INPUT TYPE=SUBMIT>
</FORM>

What a CGI will get

The query (from the environment variable QUERY_STRING) will be a URL-encoded string containing the name,value pairs of all form fields.

The CGI must decode the query and separate the individual fields.
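Separating the individual fields can be sketched in C as below. The helper name get_field is an assumption; it looks up one field by name in the raw query and copies out its value, which is still URL-encoded and has to be decoded afterwards.

```c
#include <string.h>

/* Hypothetical helper: find the field called `name` in a query of the
 * form "a=1&b=2" and copy its (still URL-encoded) value into value.
 * Returns 0 if found, -1 if the field is missing or does not fit. */
int get_field(const char *query, const char *name, char *value, size_t cap)
{
    size_t nlen = strlen(name);
    const char *p = query;

    while (*p != '\0') {
        /* One field runs from p up to the next '&' or the end. */
        const char *end = strchr(p, '&');
        if (end == NULL)
            end = p + strlen(p);

        if ((size_t)(end - p) > nlen &&
            strncmp(p, name, nlen) == 0 && p[nlen] == '=') {
            size_t vlen = (size_t)(end - p) - nlen - 1;
            if (vlen >= cap)
                return -1;
            memcpy(value, p + nlen + 1, vlen);
            value[vlen] = '\0';
            return 0;
        }
        p = (*end == '&') ? end + 1 : end;
    }
    return -1;
}
```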

HTTP Method: POST

The HTTP POST method delivers data from the browser as the content of the request.

The GET method delivers data (query) as part of the URI.

HTML Form using POST

Set the form method to POST instead of GET.

<FORM METHOD=POST ACTION=…>

GET vs. POST

When using forms it’s generally better to use POST:

• there are limits on the maximum size of a GET query string (environment variable)

• a POST query string doesn’t show up in the browser as part of the current URL.

CGI reading POST

If REQUEST_METHOD is a POST, the query is coming in STDIN.

The environment variable CONTENT_LENGTH tells us how much data to read.


Possible Problem

char buff[100];
char *clen = getenv("CONTENT_LENGTH");
if (clen==NULL)
  /* handle error */
int len = atoi(clen);
/* len comes from the client and may be larger than sizeof(buff) */
if (read(0,buff,len)<0)
  … /* handle error */
pray_for(!hacker);
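A more defensive version of the POST read could look like the following C sketch. The function name read_post_body and its parameters are assumptions; the key point is that the client-supplied length is checked against the buffer size before anything is read.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Read a POST body of the length given by the CONTENT_LENGTH string
 * `clen` from descriptor fd into buf, NUL terminating it.
 * Returns the number of bytes read, or -1 if the length is missing,
 * malformed, or too large for buf. */
ssize_t read_post_body(int fd, const char *clen, char *buf, size_t cap)
{
    if (clen == NULL || cap == 0)
        return -1;

    char *end;
    errno = 0;
    long len = strtol(clen, &end, 10);
    if (errno != 0 || end == clen || *end != '\0' || len < 0)
        return -1;

    /* Refuse anything that would not fit, leaving room for the NUL. */
    if ((size_t)len >= cap)
        return -1;

    size_t got = 0;
    while (got < (size_t)len) {
        ssize_t n = read(fd, buf + got, (size_t)len - got);
        if (n < 0)
            return -1;
        if (n == 0)
            break;               /* the body ended early */
        got += (size_t)n;
    }
    buf[got] = '\0';
    return (ssize_t)got;
}
```

In a CGI this would be called as read_post_body(0, getenv("CONTENT_LENGTH"), buff, sizeof buff).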

CGI Method summary

GET:

• REQUEST_METHOD is “GET”

• QUERY_STRING is the query

POST:

• REQUEST_METHOD is “POST”

• CONTENT_LENGTH is the size of the query

• query can be read from STDIN