Communicating on the web

Communicating on the Web This talk at: http://joind.in/10392

description

HTTP (Hyper Text Transfer Protocol) regulates simple conversations between clients and servers, like placing an order in a restaurant. However, there are some gotchas like the server having short term memory requiring the client to repeat themselves. But don’t despair, HTTP helps reduce confusion with standardized requests and responses. By following these conventions developers are able to create amazing things not possible with just POST requests and 200 OK responses. In this talk Adrian Cardenas will review examples of clients and servers, as well as the stateless nature of HTTP. He will then go into more detail about headers discussing request methods, and common request headers. Good conversations cannot be one sided, so he will also cover common response headers as well as useful response status codes.

Transcript of Communicating on the web

Page 1: Communicating on the web

Communicating on the Web

This talk at: http://joind.in/10392

Page 2: Communicating on the web

About Me
● Developer at ServerGrove
● All-around nerd
● Systems Administrator for 7 years
● @aramonc in all the places

Page 3: Communicating on the web

CAN’T COMMUNICATE WELL WITHOUT

COMMON GROUND

Page 4: Communicating on the web

HYPERTEXT TRANSFER PROTOCOL
● Designed side by side with HTML
● Before it, there were the bulletin boards
● Question & answer style two-way communication
● M2M communication method composed of text documents

Page 5: Communicating on the web

THE CLIENT
The client is any application that initiates an HTTP communication

Page 6: Communicating on the web

THE SERVER
The server is any application that receives a request and ends the exchange with a response

Page 7: Communicating on the web

HTTP IS STATELESS

Page 8: Communicating on the web

STATELESS IS THE OPPOSITE OF STATEFUL
● Stateless, in this context, is short term memory
● Stateless communication allows for
  ○ distributed systems
  ○ load balancing
  ○ managing state separately
● Makes caching more difficult
● Makes real time apps more difficult
● Application is responsible for preserving state
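
A minimal sketch of what "application is responsible for preserving state" can look like. The handler keeps no memory between calls, so the client repeats its session id on every request and the state lives in a separate store. `SESSION_STORE` and `handle_request` are hypothetical names, and a plain dict stands in for whatever shared store a real deployment would use.

```python
# Hypothetical shared store; in practice this might be a database or cache
# that every server behind a load balancer can reach.
SESSION_STORE = {}


def handle_request(session_id, item=None):
    """Stateless handler: everything it knows comes from the request
    (session_id, item) or from the external store, never from the
    handler's own memory of earlier calls."""
    cart = SESSION_STORE.setdefault(session_id, [])
    if item is not None:
        cart.append(item)
    return list(cart)


# The client repeats its session id on every request.
handle_request("abc123", "coffee")
cart = handle_request("abc123", "bagel")
print(cart)
```

Because the handler itself remembers nothing, any server with access to the store could answer the second request.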

Page 9: Communicating on the web

SHORT/LONG POLLING
● Used to update client side application state in “real time” applications
● Usually initiated by JavaScript
● Can be initiated by any client side technology like Objective C.
● Short polling initiates short lived connections to check if state changed
● Long polling initiates long lived connections until state changes
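
A rough sketch of short polling, assuming a hypothetical `check_state` callable that stands in for one quick HTTP request returning the current state. No real network call is made here; the "server" is simulated with an iterator.

```python
import time


def short_poll(check_state, last_seen, interval=0.01, max_attempts=5):
    """Ask repeatedly, in short separate requests, whether state changed.

    check_state is a hypothetical callable standing in for one quick
    HTTP request that returns the current state.
    """
    for _ in range(max_attempts):
        current = check_state()
        if current != last_seen:
            return current          # state changed: hand it to the caller
        time.sleep(interval)        # wait before the next short request
    return None                     # gave up: nothing changed


# Simulated server: the state changes on the third request.
responses = iter(["v1", "v1", "v2"])
result = short_poll(lambda: next(responses), last_seen="v1")
print(result)
```

Long polling would differ on the server side: the server holds a single request open until the state changes, so the client loop needs no sleep between attempts.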

Page 10: Communicating on the web

THE REQUEST
GET https://www.google.com/ HTTP/1.1
:version: HTTP/1.1
:method: GET
:scheme: https
:host: www.google.com
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36
accept-encoding: gzip,deflate,sdch
accept-language: en-US,en;q=0.8,es-419;q=0.6,es;q=0.4
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
cookie: OGP=-3904011:; HSID=A0hmwhHriSEJzPSI; SSID=AKHSzv76RXaggJwJ; APISID=PXmCmOabqgrdcm_z/A7eIE7i4enNC0Hn0;

Page 11: Communicating on the web

THE REQUEST
● Human readable text document
● Composed of the request, a set of headers, and an optional content body
● Headers are key value pairs separated by a colon & terminated by a new line
● Headers describe the request and offer additional metadata
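
Because a request is just text, it can be taken apart with ordinary string handling. The sketch below is illustrative only: `parse_request` is a hypothetical helper, the sample request is made up, and it assumes HTTP/1.1 style `\r\n` line endings with a blank line before the body.

```python
def parse_request(raw):
    """Split a raw HTTP request into request line, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")   # a blank line ends the headers
    lines = head.split("\r\n")
    request_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")    # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return request_line, headers, body


raw = (
    "POST /orders HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: text/plain\r\n"
    "\r\n"
    "one coffee"
)
request_line, headers, body = parse_request(raw)
print(request_line, headers, body)
```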

Page 12: Communicating on the web

THE REQUEST LINE
GET https://www.google.com/ HTTP/1.1

● The request is the first line of the document
● Composed of 3 parts
● From the right: HTTP version
  ○ Lets the server know which headers it can expect

Page 13: Communicating on the web

THE REQUEST
GET https://www.google.com/ HTTP/1.1

● URL (Uniform Resource Locator)
● Every request is for a resource
● Like interacting with a bank teller
● Composed of the scheme, the host, the path, and optionally a query string

http://server/path/?query=string
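
The parts of a URL can be pulled apart with Python's standard `urllib.parse` module. This is only an illustration using the example URL from the slide.

```python
from urllib.parse import urlsplit, parse_qs

parts = urlsplit("http://server/path/?query=string")

print(parts.scheme)           # the scheme
print(parts.netloc)           # the host
print(parts.path)             # the path
print(parse_qs(parts.query))  # the query string as a dict of lists
```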

Page 14: Communicating on the web

THE REQUEST
GET https://www.google.com/ HTTP/1.1

● A verb indicating what you would like to do with the resource

● Withdraw money, create a new account, deposit money, or even rob the bank

Page 15: Communicating on the web

COMMON METHODS
GET, POST, PUT, DELETE, HEAD, OPTIONS

● Also called verbs
● Describe the intent of the request
● CRUD is most common
● Small subset
● Some, like PATCH (RFC 5789), were added after the original HTTP/1.1 specification
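
One common convention, sketched here with `urllib.request.Request` from the standard library, maps CRUD onto methods: create is POST, read is GET, update is PUT, delete is DELETE. The `example.com` URLs and form data are hypothetical, and nothing is actually sent over the network.

```python
from urllib.request import Request

URL = "http://example.com/accounts/42"  # hypothetical resource

# CRUD mapped onto HTTP methods (a common convention, not a rule).
create = Request("http://example.com/accounts", data=b"name=Larry", method="POST")
read = Request(URL)                       # no method and no body: defaults to GET
update = Request(URL, data=b"name=Lawrence", method="PUT")
delete = Request(URL, method="DELETE")

methods = [r.get_method() for r in (create, read, update, delete)]
print(methods)
```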

Page 16: Communicating on the web

COMMON HEADERS
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36

● Describes the client
● Set by the client
● Can be changed programmatically
● The Mozilla/5.0 compatibility token is a holdover from the Netscape years

Page 17: Communicating on the web

COMMON HEADERS
accept-encoding: gzip,deflate,sdch
accept-language: en-US,en;q=0.8,es-419;q=0.6,es;q=0.4
accept-charset: utf-8
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8

Page 18: Communicating on the web

ACCEPT FAMILY

● Describes the type of content the client can understand
● The accept header is a list of MIME types
● ;q= indicates preference level
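
A small sketch of how a server might read the `;q=` preference levels. `parse_accept` is a hypothetical helper, not a library function, and it assumes a missing `q` means full preference (1.0).

```python
def parse_accept(header):
    """Return the values of an accept-style header, most preferred first."""
    choices = []
    for part in header.split(","):
        value, _, params = part.strip().partition(";")
        q = 1.0                                   # no q given means full preference
        for param in params.split(";"):
            name, _, number = param.strip().partition("=")
            if name == "q":
                q = float(number)
        choices.append((value.strip(), q))
    # sorted() is stable, so equal q values keep their original order
    return sorted(choices, key=lambda choice: choice[1], reverse=True)


languages = parse_accept("en-US,en;q=0.8,es-419;q=0.6,es;q=0.4")
print(languages)
```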

Page 19: Communicating on the web

COMMON MIME TYPES
● text/html
● text/css
● text/javascript
● text/xml
● text/plain
● application/json
● application/rss+xml
● multipart/form-data
● image/jpeg
● image/gif
● image/png
● audio/mpeg
● video/mpeg
● video/x-flv

Page 20: Communicating on the web

COMMON HEADERS
cookie: SSID=AKHSzv76RXaggJwJ;

● Describes the contents of a cookie file set by a previous connection to the same host

● Used to persist data across HTTP connections

● Stored in files locally or in memory in the client process
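
On the receiving side, a cookie header can be read with `http.cookies.SimpleCookie` from the Python standard library. The header value below is shortened from the earlier request example; treat this as an illustration rather than how any particular server does it.

```python
from http.cookies import SimpleCookie

# Cookie header value as the client would send it.
jar = SimpleCookie()
jar.load("HSID=A0hmwhHriSEJzPSI; SSID=AKHSzv76RXaggJwJ")

cookies = {name: morsel.value for name, morsel in jar.items()}
print(cookies)
```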

Page 21: Communicating on the web

NOT SO COMMON

authorization: Basic QWpIlc2FtZQ==

● Describes login credentials to password protected URLs

● Two methods, Basic and Digest
● Digest more secure, but more complicated to set up
● If not included, response is to request a set of credentials
● Best if used in combination with TLS/SSL
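
A sketch of how a Basic authorization value is built: the username and password are joined with a colon and Base64 encoded. `basic_auth_header` is a hypothetical helper, and the credentials are the well-known sample pair from the HTTP authentication RFC, not anything from this talk.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an Authorization header for the Basic scheme."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


header = basic_auth_header("Aladdin", "open sesame")
print(header)

# Base64 is an encoding, not encryption: anyone can reverse it.
decoded = base64.b64decode(header.split(" ", 1)[1]).decode("utf-8")
print(decoded)
```

The second half shows why Basic is best combined with TLS/SSL: the credentials are trivially recoverable from the header.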

Page 22: Communicating on the web

NOT SO COMMON
x-hello: world
hello: world

● x- used to describe a custom header
● Deprecated by one of the latest RFCs (RFC 6648)
● Still used by some APIs
● New form is not to use the x-
● Future proof

Page 23: Communicating on the web

REQUEST BODY
Content-Type: multipart/form-data; boundary=AaB03x

--AaB03x
Content-Disposition: form-data; name="submit-name"

Larry
--AaB03x
Content-Disposition: form-data; name="files"; filename="file1.txt"
Content-Type: text/plain

... contents of file1.txt ...
--AaB03x--

Page 24: Communicating on the web

REQUEST BODY
● Optional content for POST, PUT, etc. requests
● Typically used to send data from HTML forms
● Form data formatted as key value pairs with no boundary
● Multipart is most complicated
● Form data is separated by boundaries & terminated by the boundary plus --
● File uploads need to be done with multipart
● Content-Type is a MIME type describing the contents of the file
● Could be base64 representation of binary data
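
The two body styles can be sketched side by side. `urlencode` is the standard library call for simple key value form data. `multipart_body` is a hypothetical helper that handles only plain text fields, and the `age` field is made up for illustration.

```python
from urllib.parse import urlencode

# Simple form data: key value pairs, no boundary.
simple_body = urlencode({"submit-name": "Larry", "age": "30"})
print(simple_body)


def multipart_body(fields, boundary):
    """Join text fields into a multipart/form-data body."""
    lines = []
    for name, value in fields.items():
        lines.append("--" + boundary)             # each part starts with --boundary
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")                          # blank line before the part's content
        lines.append(value)
    lines.append("--" + boundary + "--")          # boundary plus -- ends the body
    return "\r\n".join(lines) + "\r\n"


body = multipart_body({"submit-name": "Larry"}, boundary="AaB03x")
print(body)
```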

Page 25: Communicating on the web

THE RESPONSE
HTTP/1.1 200 OK
status: 200 OK
version: HTTP/1.1
content-encoding: gzip
content-type: text/html; charset=UTF-8
date: Wed, 20 Nov 2013 01:48:58 GMT
set-cookie: PREF=ID=26af7b02617ef537:U=9bc26b9e4; expires=Fri, 20-Nov-2015 01:48:58 GMT; path=/; domain=.google.com

Page 26: Communicating on the web

COMMON HEADERS
content-encoding: gzip
content-type: text/html; charset=UTF-8

● The content body can be anything from binary, to JSON, to HTML
● The content returned is described by the content-type & content-encoding
● Related to the accept headers

Page 27: Communicating on the web

COMMON HEADERS
set-cookie: PREF=ID=26af7b02617ef537:U=9bc26b9e4; expires=Fri, 20-Nov-2015 01:48:58 GMT; path=/; domain=.google.com

● Sets or overrides a cookie in the client’s system
● Cookie content
● Optional expiration date
● Path & Domain cookie applies to
● Localhost is not a valid domain. When testing it’s preferable not to set the domain
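
The pieces of a set-cookie value (content, expiration, path, domain) can be inspected with `http.cookies.SimpleCookie`. The cookie value and the `.example.com` domain below are made up for illustration; only the date format is taken from the slide.

```python
from http.cookies import SimpleCookie

jar = SimpleCookie()
jar.load(
    "PREF=abc123; expires=Fri, 20-Nov-2015 01:48:58 GMT; "
    "path=/; domain=.example.com"
)

morsel = jar["PREF"]
print(morsel.value)        # cookie content
print(morsel["expires"])   # optional expiration date
print(morsel["path"])      # path the cookie applies to
print(morsel["domain"])    # domain the cookie applies to
```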

Page 28: Communicating on the web

THE RESPONSE
HTTP/1.1 200 OK

● Only thing required to be sent back
● Sometimes the only thing sent back
● Apache always sends back all the SHOULD headers

Page 29: Communicating on the web

STATUS CODES
200 OK, 404 NOT FOUND, 500 INTERNAL SERVER ERROR

Page 30: Communicating on the web

STATUS CODE FAMILIES
● 1xx: Informational Messages
● 2xx: Success Messages
● 3xx: Redirection Messages
● 4xx: Client Error
● 5xx: Server Error

● Specific codes convey specific messages
● Sometimes sending the status code is enough to communicate a message
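
Since the family is carried by the first digit, a client can react to a whole family without knowing every code. A minimal sketch, with `FAMILIES` and `status_family` as hypothetical names:

```python
FAMILIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_family(code):
    """Map a status code to its family using the first digit."""
    return FAMILIES.get(code // 100, "Unknown")


for code in (200, 404, 500):
    print(code, status_family(code))
```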

Page 31: Communicating on the web

1XX STATUS CODES
● 100 CONTINUE
● 101 SWITCHING PROTOCOLS

● Not very common
● Perfect for use with polling techniques for asynchronous tasks

Page 32: Communicating on the web

2XX STATUS CODES
● 201 CREATED
● 202 ACCEPTED

Page 33: Communicating on the web

3XX STATUS CODES
● 301 MOVED PERMANENTLY
● 302 FOUND
● 304 NOT MODIFIED
● 305 USE PROXY

Page 34: Communicating on the web

4XX STATUS CODES
● 401 UNAUTHORIZED
● 402 PAYMENT REQUIRED
● 403 FORBIDDEN
● 429 TOO MANY REQUESTS

Page 35: Communicating on the web

5XX STATUS CODES
● 501 NOT IMPLEMENTED
● 502 BAD GATEWAY
● 503 SERVICE UNAVAILABLE

Page 36: Communicating on the web

NOT JUST STANDARD
418 & 420

● 418 is I AM A TEAPOT, IETF April Fool’s Joke

● 420 used by Twitter for a while to indicate too many connections

Page 37: Communicating on the web

WHY DOES ANY OF IT MATTER?

Page 38: Communicating on the web

FORMS

● POST requests are marginally more secure, but not really

● Requests that carry content can carry more content on the body than on the query string

● Forms can send both query strings and content

● Can submit forms through XMLHttpRequest with extra headers

Page 39: Communicating on the web

BETTER SECURITY
● Use of Auth headers
● Use of custom headers
  ○ Server can reply with CSRF Tokens
  ○ Client can send OAuth Tokens
● Still not as secure as using SSL, but better than nothing at all.

Page 40: Communicating on the web

APIs
● Not just about HyperMedia, all is important
● Well documented
● URLs that point to actual resources
● Use of Request methods & Headers
● Use of proper Response codes
● Standard communication without vendor sponsorship

Page 41: Communicating on the web

WHAT WE LEFT OUT
● Caching
● Proxies
● Load balancing
● TLS

Page 42: Communicating on the web

THE FUTURE
● New RFCs and specifications
  ○ Patch method
  ○ New status codes
  ○ HTTP 2.0
● SPDY
  ○ Experimental protocol for a faster web
  ○ Pronounced speedy
  ○ Implementation before standardization
  ○ Claims of a 64% reduction in page load time over HTTP in lab tests
  ○ Many concurrent connections over one TCP channel

Page 43: Communicating on the web

RESOURCES
● http://net.tutsplus.com/tutorials/tools-and-tips/http-the-protocol-every-web-developer-must-know-part-1/
● http://net.tutsplus.com/sessions/http-succinctly/
● http://en.wikipedia.org/wiki/List_of_HTTP_status_codes#1xx_Informational
● http://en.wikipedia.org/wiki/Internet_media_type
● http://www.nczonline.net/blog/2009/05/05/http-cookies-explained/
● http://www.chromium.org/spdy/spdy-whitepaper
● http://http2.github.io/
● http://xkcd.com/869/
● http://blog.servergrove.com/2013/12/16/talking-http/

Page 44: Communicating on the web

THANK YOU

http://joind.in/10392