GET /Connected: A Tutorial on Web-based Services

GET Connected A Tutorial on Web-based Services Jim Webber http://jim.webber.name

Transcript of GET /Connected: A Tutorial on Web-based Services

Page 1: GET /Connected: A Tutorial on Web-based Services

GET Connected A Tutorial on Web-based Services Jim Webber

http://jim.webber.name

Page 2: GET /Connected: A Tutorial on Web-based Services

whoami

•  PhD in parallel computing
  –  Programming language design
•  {developer, architect, director} with ThoughtWorks
•  Author of “Developing Enterprise Web Services”
  –  And currently engaged in writing a book on Web-as-middleware

Page 3: GET /Connected: A Tutorial on Web-based Services

Roadmap

•  Motivation and Introduction
•  Web Architecture
•  RPC Again!
•  Embracing HTTP as an application protocol
•  Semantics
•  Hypermedia
•  RESTful services
•  Scalability
•  Atom and AtomPub
•  Security
•  WS-* Wars
•  Conclusions and further thoughts

Page 4: GET /Connected: A Tutorial on Web-based Services

Introduction

•  This is a tutorial about the Web
•  It’s very HTTP centric
•  But it’s not about Web pages!
•  The Web is a middleware platform which…
  –  Is globally deployed
  –  Has reach
  –  Is mature
  –  And is a reality in every part of our lives

•  Which makes it interesting for distributed systems geeks

Page 5: GET /Connected: A Tutorial on Web-based Services

Why Web? Why not REST?

•  REST is a brilliant architectural style
•  But the Web allows for more than just RESTful systems
•  There’s a spectrum of maturity of service styles
  –  From completely bonkers to completely RESTful
•  We’ll use the Richardson maturity model to frame these kinds of discussions
  –  Level 0 to Level 3
  –  Web-ignorant to RESTful!

Page 6: GET /Connected: A Tutorial on Web-based Services

Motivation

•  This follows the plot from a book called GET /Connected which is currently being written by:
  –  Jim Webber
  –  Savas Parastatidis
  –  Ian Robinson
•  With help from lots of other lovely people like:
  –  Halvard Skogsrud, Steve Vinoski, Mark Nottingham, Colin Jack, Spiros Tzavellas, Glen Ford, Sriram Narayan, Ken Kolchier and many more!

•  The book deals with the Web as a distributed computing platform –  The Web as a whole, not just REST

•  And so does this tutorial…

Page 7: GET /Connected: A Tutorial on Web-based Services

Web Architecture

Page 8: GET /Connected: A Tutorial on Web-based Services

Web History

•  Started as a distributed hypermedia platform
  –  CERN, Berners-Lee, 1990
•  Revolutionised hypermedia
  –  Imagine emailing someone a hypermedia deck nowadays!
•  Architecture of the Web largely fortuitous
  –  W3C and others have since retrofitted/captured the Web’s architectural characteristics

Page 9: GET /Connected: A Tutorial on Web-based Services

The Web broke the rules

Page 10: GET /Connected: A Tutorial on Web-based Services

Web Fundamentals

•  To embrace the Web, we need to understand how it works

•  The Web is a distributed hypermedia model
  –  It doesn’t try to hide that distribution from you!
•  Our challenge:
  –  Figure out the mapping between our problem domain and the underlying Web platform

Page 11: GET /Connected: A Tutorial on Web-based Services

Key Actors in the Web Architecture

[Diagram: a client, cache, router, firewalls, ISP, proxy server and reverse proxy sit between consumers and the Web servers that host resources]

Page 12: GET /Connected: A Tutorial on Web-based Services

Resources

•  A resource is something “interesting” in your system

•  Can be anything –  Spreadsheet (or one of its cells) –  Blog posting –  Printer –  Winning lottery numbers –  A transaction –  Others?

Page 13: GET /Connected: A Tutorial on Web-based Services

Interacting with Resources

•  We deal with representations of resources –  Not the resources themselves

• “Pass-by-value” semantics –  Representation can be in any format

• Any media type

•  Each resource implements a standard uniform interface –  Typically the HTTP interface

•  Resources have names and addresses (URIs) –  Typically HTTP URIs (aka URLs)

Page 14: GET /Connected: A Tutorial on Web-based Services

Resource Architecture

[Diagram of layers, as labelled on the slide: Physical Resources; Logical Resources; Uniform Interface (Web Server); Resource Representation (e.g. XML document); Consumer (Web Client)]

Page 15: GET /Connected: A Tutorial on Web-based Services

Resource Representations

•  Making your system Web-friendly increases its surface area
  –  You expose many resources, rather than fewer endpoints
•  Each resource has one or more representations
  –  Representations like JSON or XML are good for the programmatic Web
•  Moving representations across the network is the way we transact work in a Web-native system

Page 16: GET /Connected: A Tutorial on Web-based Services

URIs

•  URIs are addresses of resources in Web-based systems
  –  Each resource has at least one URI
•  They identify “interesting” things
  –  i.e. Resources
•  Any resource implements the same (uniform) interface
  –  Which means we can access it programmatically!

Page 17: GET /Connected: A Tutorial on Web-based Services

URI/Resource Relationship

•  Any two resources cannot be identical
  –  Because then you’ve only got one resource!
•  But they can have more than one name
  –  http://foo.com/software/latest
  –  http://foo.com/software/v1.4
•  No mechanism for URI equality
•  Canonical URIs are long-lived
  –  E.g. http://example.com/versions/1.1 versus http://example.com/versions/latest
  –  Send back HTTP 303 (“See Other”) if the request is for an alternate URI
  –  Or set the Content-Location header in the response

Page 18: GET /Connected: A Tutorial on Web-based Services

Web Characteristics

•  Scalable
•  Fault-tolerant
•  Recoverable
•  Secure
•  Loosely coupled

•  Precisely the same characteristics we want in business software systems!

Page 19: GET /Connected: A Tutorial on Web-based Services

Scalability

•  Web is truly Internet-scale
  –  Loose coupling
    •  Growth of the Web in one place is not impacted by changes in other places
  –  Uniform interface
    •  HTTP defines a standard interface for all actors on the Web
    •  Replication and caching are baked into this model
      –  Caches have the same interface as real resources!
  –  Stateless model
    •  Supports horizontal scaling

Page 20: GET /Connected: A Tutorial on Web-based Services

Fault Tolerant

•  The Web is stateless
  –  All information required to process a request must be present in that request
    •  Sessions are still plausible, but must be handled in a Web-consistent manner
      –  Modelled as resources!
•  Statelessness means easy replication
  –  One Web server is replaceable with another
  –  Easy fail-over, horizontal scaling

Page 21: GET /Connected: A Tutorial on Web-based Services

Recoverable

•  The Web places emphasis on repeatable information retrieval
  –  GET is idempotent
    •  Library of Congress found this the hard way!
  –  In failure cases, can safely repeat GET on resources
•  HTTP verbs plus rich error handling help to remove guesswork from recovery
  –  HTTP statuses tell you what happened!
  –  Some verbs (e.g. PUT, DELETE) are safe to repeat
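A minimal Java sketch of the "which verbs can I repeat?" question raised on this slide. The class and method names are hypothetical; the method lists follow the safe and idempotent method definitions in RFC 2616 (sections 9.1.1 and 9.1.2), and a real client would likely add its own policy on top.

```java
import java.util.Locale;
import java.util.Set;

/** Illustrative helper (hypothetical name): classifies HTTP methods by retry guarantees. */
final class RetrySafety {
    // Safe methods are not expected to change server state.
    private static final Set<String> SAFE = Set.of("GET", "HEAD");

    // Idempotent methods can be repeated with the same end result as a single call.
    private static final Set<String> IDEMPOTENT =
            Set.of("GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE");

    static boolean isSafe(String method) {
        return SAFE.contains(method.toUpperCase(Locale.ROOT));
    }

    static boolean canRepeat(String method) {
        return IDEMPOTENT.contains(method.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String method : new String[] {"GET", "PUT", "DELETE", "POST"}) {
            System.out.println(method + " repeatable after a failure: " + canRepeat(method));
        }
    }
}
```

POST is deliberately absent from both sets: repeating it may create a second resource, so recovery needs a GET first to find out what happened.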

Page 22: GET /Connected: A Tutorial on Web-based Services

Secure

•  HTTPS is a mature technology
  –  Based on SSL for secure point-to-point information retrieval
•  Isn’t sympathetic to Web architecture
  –  Can’t cache!
•  Higher-order protocols like Atom are starting to change this...
  –  Encrypt parts of a resource representation, not the transport channel
  –  OK to cache!

Page 23: GET /Connected: A Tutorial on Web-based Services

Loosely Coupled

•  Adding a Web site to the WWW does not affect any other existing sites

•  All Web actors support the same, uniform interface
  –  Easy to plumb new actors into the big wide web
    •  Caches, proxies, servers, resources, etc

Page 24: GET /Connected: A Tutorial on Web-based Services

RPC Again!

Page 25: GET /Connected: A Tutorial on Web-based Services

Web Tunnelling

•  Web Services tunnel SOAP over HTTP
  –  Using the Web as a transport only
  –  Ignoring many of the features for robustness the Web has built in
•  Many Web people do the same!
  –  URI tunnelling, POX approaches are the most popular styles on today’s Web
  –  Worse than SOAP!
    •  Less metadata!

[Callout: But they claim to be “lightweight” and RESTful]

Page 26: GET /Connected: A Tutorial on Web-based Services

Richardson Model Level 1

•  Lots of URIs
  –  But really has a more level 0 mindset
•  Doesn’t understand HTTP
  –  Other than as a transport
•  No hypermedia

Page 27: GET /Connected: A Tutorial on Web-based Services

URI Tunnelling Pattern

•  Web servers understand URIs
•  URIs have structure
•  Methods have signatures
•  Can match URI structure to method signature

Page 28: GET /Connected: A Tutorial on Web-based Services

On The Wire

Page 29: GET /Connected: A Tutorial on Web-based Services

Server-Side URI Tunnelling Example

public void ProcessGet(HttpListenerContext context)
{
    // Parse the URI
    Order order = ParseUriForOrderDetails(context.Request.QueryString);

    string response = string.Empty;

    if (order != null)
    {
        // Process the order by calling the mapped method
        var orderConfirmation = RestbucksService.PlaceOrder(order);

        response = "OrderId=" + orderConfirmation.OrderId.ToString();
    }
    else
    {
        response = "Failure: Could not place order.";
    }

    // Write to the response stream
    using (var sw = new StreamWriter(context.Response.OutputStream))
    {
        sw.Write(response);
    }
}

Page 30: GET /Connected: A Tutorial on Web-based Services

Client-Side URI Tunnelling

public OrderConfirmation PlaceOrder(Order order)
{
    // Create the URI
    var sb = new StringBuilder("http://restbucks.com/PlaceOrder?");

    sb.AppendFormat("coffee={0}", order.Coffee.ToString());
    sb.AppendFormat("&size={0}", order.Size.ToString());
    sb.AppendFormat("&milk={0}", order.Milk.ToString());
    sb.AppendFormat("&consume-location={0}", order.ConsumeLocation.ToString());

    // Set up the GET request (WebRequest.Create is the factory method for HttpWebRequest)
    var request = WebRequest.Create(sb.ToString()) as HttpWebRequest;
    request.Method = "GET";

    // Get the response
    var response = request.GetResponse();

    // Read the contents of the response
    OrderConfirmation orderConfirmation = null;
    using (var sr = new StreamReader(response.GetResponseStream()))
    {
        var str = sr.ReadToEnd();

        // Create an OrderConfirmation object from the response
        orderConfirmation = new OrderConfirmation(str);
    }
    return orderConfirmation;
}

Page 31: GET /Connected: A Tutorial on Web-based Services

URI Tunnelling Strengths

•  Very easy to understand
•  Great for simple procedure-calls
•  Simple to code
  –  Do it with the servlet API, HttpListener, IHttpHandler, Rails, whatever!
•  Interoperable
  –  It’s just URIs!

Page 32: GET /Connected: A Tutorial on Web-based Services

URI Tunnelling Weaknesses

•  It’s brittle RPC!
•  Tight coupling, no metadata
  –  No typing or “return values” specified in the URI
•  Not robust – have to handle failure cases manually
•  No metadata support
  –  Construct the URIs yourself, map them to the function manually
•  You typically use GET (prefer POST)
  –  OK for functions, but against the Web for procedures with side effects

Page 33: GET /Connected: A Tutorial on Web-based Services

POX Pattern

•  Web servers understand how to process requests with bodies
  –  Because they understand forms
•  And how to respond with a body
  –  Because that’s how the Web works
•  POX uses XML in the HTTP request and response to move a call stack between client and server

Page 34: GET /Connected: A Tutorial on Web-based Services

Richardson Model Level 0

•  Single well-known endpoint
  –  Not really URI friendly
•  Doesn’t understand HTTP
  –  Other than as a transport
•  No hypermedia

Page 35: GET /Connected: A Tutorial on Web-based Services

POX Architecture

Page 36: GET /Connected: A Tutorial on Web-based Services

POX on the Wire

Page 37: GET /Connected: A Tutorial on Web-based Services

.Net POX Service Example

// Called from the Web server
private void ProcessRequest(HttpListenerContext context)
{
    // Check the HTTP verb (we want POST)
    string verb = context.Request.HttpMethod.ToLower().Trim();
    switch (verb)
    {
        case "post":
        {
            // Everything's done with post in this case
            // Turn the HTTP body into an XML document for processing
            XmlDocument request = new XmlDocument();
            request.Load(XmlReader.Create(context.Request.InputStream));

            // Dispatch it for processing
            XmlElement result = MyApp.Process(request.DocumentElement);

            // Get the XML result, and get its bytes
            byte[] returnValue = Utils.ConvertUnicodeString(Constants.Xml.XML_DECLARATION + result.OuterXml);

            // Return the XML bytes to the client
            context.Response.OutputStream.Write(returnValue, 0, returnValue.Length);

            break;
        }
        ...

Page 38: GET /Connected: A Tutorial on Web-based Services

Java POX Servlet

public class RestbucksService extends HttpServlet {

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {

        // Initialization code omitted for brevity
        try {
            requestReader = request.getReader();
            responseWriter = response.getWriter();

            // Pull the XML payload out of the HTTP request body
            String xmlRequest = extractPayload(requestReader);

            Order order = createOrder(xmlRequest);

            OrderConfirmation confirmation = restbucksService.placeOrder(order);

            // Write the confirmation to the response (not the request) writer
            embedPayload(responseWriter, confirmation.toString());

        } finally {
            // Cleanup code omitted for brevity
        }
    }
}

Page 39: GET /Connected: A Tutorial on Web-based Services

C# POX Client Example

public OrderConfirmation PlaceOrder(string customerId, Item[] items)
{
    // Serialize our objects
    XmlDocument requestXml = CreateXmlRequest(customerId, items);
    var client = new WebClient();

    var ms = new MemoryStream();
    requestXml.Save(ms);

    client.Headers.Add("Content-Type", "application/xml");

    // A null method makes UploadData default to POST for HTTP URIs
    ms = new MemoryStream(client.UploadData("http://restbucks.com/PlaceOrder", null, ms.ToArray()));

    var responseXml = new XmlDocument();
    responseXml.Load(ms);
    return CreateOrderConfirmation(responseXml);
}

Page 40: GET /Connected: A Tutorial on Web-based Services

Java Apache Commons Client

public class OrderingClient {

    private static final String XML_HEADING = "<?xml version=\"1.0\"?>\n";
    private static final String NO_RESPONSE = "Error: No response.";

    public String placeOrder(String customerId, String[] itemIds)
            throws Exception {

        // XML string creation omitted for brevity
        // ...

        String response = sendRequestPost(request, "http://restbucks.com/PlaceOrder");

        Document xmlResponse = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(
                        new InputSource(new StringReader(response)));

        // XML response handling omitted for brevity
    }

    private String sendRequestPost(String request, String uri)
            throws IOException, HttpException {

        PostMethod method = new PostMethod(uri);
        method.setRequestHeader("Content-type", "application/xml");
        method.setRequestBody(XML_HEADING + request);

        String responseBody = NO_RESPONSE;
        try {
            new HttpClient().executeMethod(method);
            responseBody = new String(method.getResponseBody(), "UTF-8");
        } finally {
            // Always hand the connection back, even on failure
            method.releaseConnection();
        }

        return responseBody;
    }
}

Page 41: GET /Connected: A Tutorial on Web-based Services

POX Strengths

•  Simplicity – just use HTTP POST and XML
•  Re-use existing infrastructure and libraries
•  Interoperable
  –  It’s just XML and HTTP
•  Can use complex data structures
  –  By encoding them in XML

Page 42: GET /Connected: A Tutorial on Web-based Services

POX Weaknesses

•  Client and server must collude on XML payload
  –  Tightly coupled approach
•  No metadata support
  –  Unless you’re using a POX toolkit that supports WSDL with HTTP binding (like WCF)
•  Does not use Web for robustness
•  Does not use SOAP + WS-* for robustness either

Page 43: GET /Connected: A Tutorial on Web-based Services

Web Abuse

•  Both POX and URI Tunnelling fail to take advantage of the Web
  –  Ignoring status codes
  –  Reduced scope for caching
  –  No metadata
  –  Manual crash recovery/compensation leading to high development cost
  –  Etc
•  They’re useful in some situations
  –  And you can implement them with minimal toolkit support
  –  But they’re not especially robust patterns

Page 44: GET /Connected: A Tutorial on Web-based Services

Tech Interlude HTTP Fundamentals

Page 45: GET /Connected: A Tutorial on Web-based Services

The HTTP Verbs

•  Retrieve a representation of a resource: GET
•  Create a new resource: PUT to a new URI, or POST to an existing URI
•  Modify an existing resource: PUT to an existing URI
•  Delete an existing resource: DELETE
•  Get metadata about an existing resource: HEAD
•  See which of the verbs the resource understands: OPTIONS

[Callout: reading down the list, decreasing likelihood of being understood by a Web server today]

Page 46: GET /Connected: A Tutorial on Web-based Services

HTTP Status Codes

•  The HTTP status codes provide metadata about the state of resources

•  They are part of what makes the Web a rich platform for building distributed systems

•  They cover five broad categories
  –  1xx – Metadata
  –  2xx – Everything’s fine
  –  3xx – Redirection
  –  4xx – Client did something wrong
  –  5xx – Server did a bad thing

•  There are a handful of these codes that we need to know in more detail
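As a rough illustration of the five categories, the class of a status code is simply its first digit. The sketch below is an assumption-laden helper (the class name and the category labels are made up here, loosely following the slide’s wording), not part of any HTTP library.

```java
/** Illustrative helper (hypothetical name): maps an HTTP status code to its broad category. */
final class StatusCategory {

    static String describe(int status) {
        // Integer division by 100 leaves only the leading digit of a three-digit code.
        switch (status / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default:
                throw new IllegalArgumentException("Not an HTTP status code: " + status);
        }
    }

    public static void main(String[] args) {
        for (int status : new int[] {100, 201, 303, 404, 503}) {
            System.out.println(status + " -> " + describe(status));
        }
    }
}
```

A client that meets an unfamiliar code can still react sensibly by falling back to the category, which is how the HTTP spec expects unknown codes to be treated.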

Page 47: GET /Connected: A Tutorial on Web-based Services

1xx

•  100 – Continue
  –  The operation will be accepted by the service
  –  The “look before you leap” pattern
    •  Use with the Expect header

Page 48: GET /Connected: A Tutorial on Web-based Services

2xx

•  200 – OK
  –  The server successfully completed whatever the client asked of it
•  201 – Created
  –  Sent when a new resource is created at the client’s request via POST
  –  Location header should contain the URI to the newly created resource
•  202 – Accepted
  –  Client’s request can’t be handled in a timely manner
  –  Location header should contain a URI to the resource that will eventually be exposed to fulfil the client’s expectations

Page 49: GET /Connected: A Tutorial on Web-based Services

More 2xx Codes

•  203 – Non-Authoritative Information
  –  Much like 200, except the client knows not to place full trust in any headers since they could have come from 3rd parties or be cached etc.
•  204 – No Content
  –  The server declines to send back a representation
    •  Perhaps because the associated resource doesn’t have one
  –  Used like an “ack”
    •  Prominent in AJAX applications

Page 50: GET /Connected: A Tutorial on Web-based Services

3xx

•  301 – Moved Permanently
  –  Location header contains the new location of the resource
•  303 – See Other
  –  Location header contains the location of an alternative resource
  –  Used for redirection

Page 51: GET /Connected: A Tutorial on Web-based Services

More 3xx

•  304 – Not Modified
  –  The resource hasn’t changed, use the existing representation
  –  Used in conjunction with conditional GET
  –  Client sends the If-Modified-Since header
  –  Response Date header must be set
  –  Response ETag and Content-Location headers must be the same as for the original representation
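To make the conditional GET decision concrete, here is a minimal server-side sketch in Java. It assumes the server knows the resource’s last-modified instant and receives the raw If-Modified-Since header value; the class and method names are hypothetical, and a real server would also handle malformed dates and ETag-based validation.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

/** Illustrative helper (hypothetical name): decides between 200 and 304 for a conditional GET. */
final class ConditionalGet {

    static int statusFor(Instant lastModified, String ifModifiedSince) {
        if (ifModifiedSince == null) {
            return 200; // Unconditional GET: always send the representation.
        }
        // HTTP dates use the RFC 1123 format, e.g. "Wed, 19 Nov 2008 21:48:10 GMT".
        Instant since = ZonedDateTime
                .parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant();
        // HTTP dates only have one-second resolution, so compare at that granularity.
        boolean changed = lastModified.truncatedTo(ChronoUnit.SECONDS).isAfter(since);
        return changed ? 200 : 304;
    }

    public static void main(String[] args) {
        Instant lastModified = Instant.parse("2008-11-19T21:48:10Z");
        System.out.println(statusFor(lastModified, "Wed, 19 Nov 2008 21:48:10 GMT"));
        System.out.println(statusFor(lastModified, "Wed, 19 Nov 2008 21:00:00 GMT"));
    }
}
```

On a 304 the server sends headers only, so the client (or an intermediate cache) reuses the representation it already holds.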

Page 52: GET /Connected: A Tutorial on Web-based Services

4xx

•  400 – Bad Request
  –  The client has PUT or POSTed a resource representation that is in the right format, but contains invalid information
•  401 – Unauthorised
  –  Proper credentials to operate on a resource weren’t provided
  –  Response WWW-Authenticate header contains the type of authentication the server expects
    •  Basic, digest, WSSE, etc
  –  Don’t leak information!
    •  Consider 404 in these situations

Page 53: GET /Connected: A Tutorial on Web-based Services

More 4xx

•  403 – Forbidden
  –  The client request is OK, but the server doesn’t want to process it
    •  E.g. Restricted by IP address
  –  Implies that resource exists, beware leaking information
•  404 – Not Found
  –  The standard catch-all response
  –  May be a lie to prevent 401 or 403 information leakage

Page 54: GET /Connected: A Tutorial on Web-based Services

Even more 4xx

•  405 – Method Not Allowed
  –  The resource doesn’t support a given method
  –  The response Allow header lists the verbs the resource understands
    •  E.g. Allow: GET, POST, PUT
•  406 – Not Acceptable
  –  The client places too many restrictions on the resource representation via the Accept-* header in the request
  –  The server can’t satisfy any of those representations

Page 55: GET /Connected: A Tutorial on Web-based Services

Yet More 4xx

•  409 – Conflict
  –  Tried to change the state of the resource to something the server won’t allow
    •  E.g. Trying to DELETE something that doesn’t exist
•  410 – Gone
  –  The resource has gone, permanently.
  –  Don’t send in response to DELETE
    •  The client won’t know if it was deleted, or if it was gone and the delete failed

Page 56: GET /Connected: A Tutorial on Web-based Services

Still more 4xx

•  412 – Precondition Failed
  –  Server/resource couldn’t meet one or more preconditions
    •  As specified in the request header
  –  E.g. Using If-Unmodified-Since and PUT to modify a resource provided it hasn’t been changed by others
•  413 – Request Entity Too Large
  –  Response comes with the Retry-After header in the hope that the failure is transient

Page 57: GET /Connected: A Tutorial on Web-based Services

5xx Codes

•  500 – Internal Server Error
  –  The normal response when we’re lazy
•  503 – Service Unavailable
  –  The HTTP server is up, but not supporting resource communication properly
  –  Server may send a Retry-After header, assuming the fault is transient

Page 58: GET /Connected: A Tutorial on Web-based Services

HTTP Headers

•  Headers provide metadata to assist processing
  –  Identify resource representation format (media type), length of payload, supported verbs, etc
•  HTTP defines a wealth of these
  –  And like status codes they are our building blocks for robust service implementations

Page 59: GET /Connected: A Tutorial on Web-based Services

Must-know Headers

•  Authorization
  –  Contains credentials (basic, digest, WSSE, etc)
  –  Extensible
•  Content-Length
  –  Length of payload, in bytes
•  Content-Type
  –  The resource representation form
    •  E.g. application/xml, application/xhtml+xml

Page 60: GET /Connected: A Tutorial on Web-based Services

More Must-Know Headers

•  ETag/If-None-Match
  –  Opaque identifier – think “checksum” for resource representations
  –  Used for conditional actions
•  If-Modified-Since/Last-Modified
  –  Used for conditional operations too
•  Host
  –  Contains the domain-name part of the URI

Page 61: GET /Connected: A Tutorial on Web-based Services

Yet More Must-Know Headers

•  Location
  –  Used to flag the location of a created/moved resource
  –  In combination with:
    •  201 Created, 301 Moved Permanently, 302 Found, 307 Temporary Redirect, 300 Multiple Choices, 303 See Other
•  User-Agent
  –  Tells the server side what the client-side capabilities are
  –  Should not be used in the programmable Web!

Page 62: GET /Connected: A Tutorial on Web-based Services

Final Must-Know Headers

•  WWW-Authenticate
  –  Used with 401 status
  –  Informs client what authentication is needed
•  Date
  –  Mandatory!
  –  Timestamps on request and response

Page 63: GET /Connected: A Tutorial on Web-based Services

Useful Headers

•  Accept
  –  Client tells server what formats it wants
  –  Can externalise this in URI names in the general case

Page 64: GET /Connected: A Tutorial on Web-based Services

More Useful Headers

•  Cache-Control
  –  Metadata for caches, tells them how to cache (or not) the resource representation
  –  And for how long etc.
•  Content-MD5
  –  Cryptographic checksum of body
  –  Useful integrity check, has computation cost
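For the Content-MD5 header, the value is the Base64 encoding of the 128-bit MD5 digest of the body (RFC 1864). A minimal Java sketch using only the standard library follows; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Illustrative helper (hypothetical name): computes a Content-MD5 header value. */
final class ContentMd5 {

    static String headerValue(byte[] body) {
        try {
            // Digest the raw body bytes, then Base64-encode the 16-byte result.
            byte[] digest = MessageDigest.getInstance("MD5").digest(body);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform, so this is unexpected.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] body = "<order/>".getBytes(StandardCharsets.UTF_8);
        System.out.println("Content-MD5: " + headerValue(body));
    }
}
```

The receiver recomputes the digest over the bytes it received and compares it with the header; this detects accidental corruption but is not protection against deliberate tampering.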

Page 65: GET /Connected: A Tutorial on Web-based Services

Yet More Useful Headers

•  Expect
  –  A conditional – client asks if it’s OK to proceed by expecting 100-Continue
  –  Server either responds with 100 or 417 – Expectation Failed
•  Expires
  –  Server tells client or proxy server that representation can be safely cached until a certain time
•  If-Match
  –  Used for ETag comparison
  –  Opposite of If-None-Match
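A short Java sketch of how a server might evaluate If-Match before applying a PUT. It assumes the resource already exists and that ETags are compared as exact strings (including their quotes); the names are hypothetical and the success status is simplified to 200.

```java
/** Illustrative helper (hypothetical name): evaluates an If-Match precondition for a PUT. */
final class IfMatchCheck {

    static int statusForPut(String currentEtag, String ifMatch) {
        if (ifMatch == null) {
            return 200; // No precondition supplied: the update goes ahead.
        }
        // The header may carry a comma-separated list of entity tags, or "*".
        for (String candidate : ifMatch.split(",")) {
            String tag = candidate.trim();
            if (tag.equals("*") || tag.equals(currentEtag)) {
                return 200; // Client's view of the resource is still current.
            }
        }
        return 412; // Precondition Failed: someone else changed the resource first.
    }

    public static void main(String[] args) {
        System.out.println(statusForPut("\"v2\"", "\"v2\""));
        System.out.println(statusForPut("\"v2\"", "\"v1\""));
    }
}
```

If-None-Match inverts the test: the request proceeds only when none of the listed tags matches, which is what makes it useful for conditional GET.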

Page 66: GET /Connected: A Tutorial on Web-based Services

Final Useful Headers

•  If-Unmodified-Since
  –  Useful for conditional PUT/POST
    •  Make sure the resource hasn’t changed while you’ve been manipulating it
  –  Compare with If-Modified-Since
•  Range
  –  Specify part of a resource representation (a byte range) that you need – aka partial GET
  –  Useful for failure/recovery scenarios

Page 67: GET /Connected: A Tutorial on Web-based Services

HTTP RFC 2616 is Authoritative

•  The statuses and headers here are a sample of the full range of statuses and headers in the HTTP spec
•  The spec contains more than we discuss here
•  It is authoritative about usage
•  And it’s a good thing to keep handy when you’re working on a Web-based distributed system!

Page 68: GET /Connected: A Tutorial on Web-based Services

Tech Interlude URI Templates

Page 69: GET /Connected: A Tutorial on Web-based Services

Conflicting URI Philosophies

•  URIs should be descriptive, predictable?
  –  http://spreadsheet/cells/a2,a9
  –  http://jim.webber.name/2007/06.aspx
•  Convey some ideas about how the underlying resources are arranged
•  Can infer http://spreadsheet/cells/b0,b10 and http://jim.webber.name/2005/05.aspx for example
•  Nice for programmatic access, but may introduce coupling
•  URIs should be opaque?
  –  http://tinyurl.com/6vfs6
  –  TimBL says “opaque URIs are cool”
  –  Convey no semantics, can’t infer anything from them
•  Don’t introduce coupling

Page 70: GET /Connected: A Tutorial on Web-based Services

URI Templates, in brief

• Use URI templates to make your resource structure easy to understand

•  For Amazon S3 (storage service) it’s easy:
  –  http://s3.amazon.com/{bucket-name}/{object-name}

[Diagram: Bucket1 contains Object1, Object2 and Object3; Bucket2 contains Object1 and Object2]

Page 71: GET /Connected: A Tutorial on Web-based Services

URI Templates are Easy!

•  Take the URI: http://restbucks.com/orders?{order_id}
•  You could do the substitution and get a URI: http://restbucks.com/orders?1234
•  Can easily make more complex URIs too
  –  Mixing template and non-template sections: http://restbucks.com/{orders}/{shop}/{year}/{month}.xml
•  Use URI templates client-side to compute server-side URIs
  –  But beware introducing coupling!
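The substitution step can be sketched in a few lines of Java. This is a deliberately naive expander written for illustration (hypothetical class name, simple {name} variables only, no percent-encoding of values); it is not an implementation of any URI template specification.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper (hypothetical name): naive {variable} substitution for URI templates. */
final class SimpleUriTemplate {
    private static final Pattern VARIABLE = Pattern.compile("\\{([^}]+)\\}");

    static String expand(String template, Map<String, String> values) {
        Matcher matcher = VARIABLE.matcher(template);
        StringBuilder uri = new StringBuilder();
        int last = 0;
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = values.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No value supplied for {" + name + "}");
            }
            // Copy the literal text before the variable, then the substituted value.
            uri.append(template, last, matcher.start()).append(value);
            last = matcher.end();
        }
        // Copy whatever literal text follows the final variable.
        return uri.append(template.substring(last)).toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("http://restbucks.com/orders?{order_id}",
                Map.of("order_id", "1234")));
    }
}
```

Failing loudly on a missing variable is a design choice here: a half-expanded URI sent to a server is harder to diagnose than an exception on the client.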

Page 72: GET /Connected: A Tutorial on Web-based Services

Why URI Templates?

•  Regular URIs are a good idiom in Web-based services
  –  Helps with understanding, self documentation
•  They allow users to infer a URI
  –  If the pattern is regular
•  URI templates formalise this arrangement
  –  And advertise a template rather than a regular URI

Page 73: GET /Connected: A Tutorial on Web-based Services

URI Templates Pros and Cons

•  Everything interesting in a Web-based service has a URI
•  Remember, two schools of thought:
  –  Opaque URIs are cool (Berners-Lee)
  –  Transparent URIs are cool (everyone else)
•  URI templates present two core concerns:
  –  They invite clients to invent URIs which may not be honoured by the server
  –  They increase coupling since servers must honour forever any URI templates they’ve advertised
•  Use URI templates sparingly, and with caution
  –  Entry point URIs only is a good rule of thumb
  –  Or use them RESTfully, as we shall discuss later

Page 74: GET /Connected: A Tutorial on Web-based Services

Embracing HTTP as an Application Protocol

Page 75: GET /Connected: A Tutorial on Web-based Services

Using the Web

•  URI tunnelling and POX use the Web as a transport
  –  Just like SOAP without metadata support
•  CRUD services begin to use the Web’s coordination support
•  But the Web is more than transport
  –  Transport, plus
  –  Metadata, plus
  –  Fault model, plus
  –  Component model, plus
  –  Runtime environment, plus...

[Callouts: HTTP headers; status codes; uniform interface; caches, proxies, servers, etc]

Page 76: GET /Connected: A Tutorial on Web-based Services

CRUD Resource Lifecycle

•  The resource is created with POST
•  It’s read with GET
•  And updated via PUT
•  Finally it’s removed using DELETE

Page 77: GET /Connected: A Tutorial on Web-based Services

Richardson Model Level 2

•  Lots of URIs
•  Understands HTTP!
•  No hypermedia

Page 78: GET /Connected: A Tutorial on Web-based Services

Create with POST

[Sequence diagram between Ordering Client and Ordering Service]
Request: POST /orders <order … />
Possible responses:
  –  201 Created, Location: …/1234
  –  400 Bad Request
  –  500 Internal Error

Page 79: GET /Connected: A Tutorial on Web-based Services

POST Semantics

•  POST creates a new resource
•  But the server decides on that resource’s URI
•  Common human Web example: posting to Web log
  –  Server decides URI of posting and any comments made on that post
•  Programmatic Web example: creating a new employee record
  –  And subsequently adding to it

Page 80: GET /Connected: A Tutorial on Web-based Services

POST Request

POST /orders HTTP/1.1
Host: restbucks.example.com
Content-Type: application/xml
Content-Length: 225

<order xmlns="http://schemas.restbucks.com/order">
  <location>takeAway</location>
  <items>
    <item>
      <name>latte</name>
      <quantity>1</quantity>
      <milk>whole</milk>
      <size>small</size>
    </item>
  </items>
</order>

[Callouts: first line is the verb, path, and HTTP version; Content-Type declares generic XML content; the body is the content (again Restbucks XML)]

Page 81: GET /Connected: A Tutorial on Web-based Services

POST Response

HTTP/1.1 201 Created
Location: /orders/1234

Page 82: GET /Connected: A Tutorial on Web-based Services

When POST goes wrong

•  We may get 4xx or 5xx errors
  –  Client versus server problem
•  We turn to GET!
•  Find out the resource states first
  –  Then figure out how to make forward or backward progress
•  Then solve the problem
  –  May involve POSTing again
  –  May involve a PUT to rectify server-side resources in-place

Page 83: GET /Connected: A Tutorial on Web-based Services

POST Implementation with a Servlet

protected void doPost(HttpServletRequest request, HttpServletResponse response) {
    try {
        Order order = extractOrderFromRequest(request);
        String internalOrderId = OrderDatabase.getDatabase().saveOrder(order);
        // Tell the client where the newly created resource lives
        response.setHeader("Location", computeLocationHeader(request, internalOrderId));
        response.setStatus(HttpServletResponse.SC_CREATED);
    } catch (Exception ex) {
        response.setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
    }
}

Page 84: GET /Connected: A Tutorial on Web-based Services

Read with GET

[Sequence diagram between Ordering Client and Ordering Service]
Request: GET /orders/1234
Possible responses:
  –  200 OK <order … />
  –  404 Not Found
  –  500 Internal Error

Page 85: GET /Connected: A Tutorial on Web-based Services

GET Semantics

•  GET retrieves the representation of a resource
•  Should be idempotent
  –  Shared understanding of GET semantics
  –  Don’t violate that understanding!

[Callout: Library of Congress catalogue incident!]

Page 86: GET /Connected: A Tutorial on Web-based Services

GET Exemplified

GET /orders/1234 HTTP/1.1
Accept: application/vnd.restbucks+xml
Host: restbucks.com

[Callout: Should the expected resource representation be in a header, or made explicit in the URI?]

Page 87: GET /Connected: A Tutorial on Web-based Services

GET Response

HTTP/1.1 200 OK
Content-Length: 232
Content-Type: application/vnd.restbucks+xml
Date: Wed, 19 Nov 2008 21:48:10 GMT

<order xmlns="http://schemas.restbucks.com/order">
  <location>takeAway</location>
  <items>
    <item>
      <name>latte</name>
      <quantity>1</quantity>
      <milk>whole</milk>
      <size>small</size>
    </item>
  </items>
  <status>pending</status>
</order>

Page 88: GET /Connected: A Tutorial on Web-based Services

When GET Goes wrong

•  Simple!
  –  Just 404 – the resource is no longer available

HTTP/1.1 404 Not Found
Date: Sat, 20 Dec 2008 19:01:33 GMT

•  Are you sure?
  –  GET again!
•  GET is safe and idempotent
  –  Great for crash recovery scenarios!

Page 89: GET /Connected: A Tutorial on Web-based Services

Idempotent Behaviour

•  An action with no additional side effects when repeated
  –  Comes from mathematics
•  In practice means two things:
  –  A safe operation is one which changes no state at all
    •  E.g. HTTP GET
  –  An idempotent operation is one which updates state in an absolute way
    •  E.g. x = 4 rather than x += 2
•  Web-friendly systems scale because of safety
  –  Caching!
•  And are fault tolerant because of idempotent behaviour
  –  Just re-try in failure cases
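The “x = 4 rather than x += 2” distinction can be shown with a tiny Java sketch. The names are hypothetical; the two functions stand in for an absolute update (PUT-like) and a relative update (the kind of change that is not safe to repeat).

```java
/** Illustrative demo (hypothetical name): absolute versus relative state updates. */
final class IdempotenceDemo {

    // Absolute update: the new state ignores the old one, like x = 4.
    static int setTo(int state, int value) {
        return value;
    }

    // Relative update: the new state depends on the old one, like x += 2.
    static int add(int state, int delta) {
        return state + delta;
    }

    public static void main(String[] args) {
        int once = setTo(0, 4);
        int twice = setTo(setTo(0, 4), 4);
        System.out.println("absolute update repeated gives same state: " + (once == twice));

        int addedOnce = add(0, 2);
        int addedTwice = add(add(0, 2), 2);
        System.out.println("relative update repeated gives same state: " + (addedOnce == addedTwice));
    }
}
```

This is why a client that is unsure whether a PUT arrived can simply send it again, whereas a repeated relative update would leave the resource in a different state.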

Page 90: GET /Connected: A Tutorial on Web-based Services

GET JAX-RS Implementation

@Path("/")
public class OrderingService {
    @GET
    @Produces("application/vnd.restbucks+xml")
    @Path("/{orderId}")
    public String getOrder(@PathParam("orderId") String orderId) {
        try {
            Order order = OrderDatabase.getDatabase().getOrder(orderId);
            if (order != null) {
                return xstream.toXML(order);
            } else {
                throw new WebApplicationException(404);
            }
        } catch (WebApplicationException e) {
            // Re-throw so the 404 above is not masked as a 500 by the catch below
            throw e;
        } catch (Exception e) {
            throw new WebApplicationException(500);
        }
    }
    // Remainder of implementation omitted for brevity
}

Page 91: GET /Connected: A Tutorial on Web-based Services

Update with PUT

[Diagram: the Ordering Client sends PUT /orders/1234 <order … /> to the Ordering Service, which replies with 200 OK, 404 Not Found, 409 Conflict, or 500 Internal Error]

Page 92: GET /Connected: A Tutorial on Web-based Services

PUT Semantics

•  PUT creates a new resource but the client decides on the URI –  Providing the server logic allows it

•  Also used to update existing resources by overwriting them in-place

•  PUT is idempotent –  Makes absolute changes

•  But is not safe –  It changes state!
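A minimal sketch of these semantics, assuming a hypothetical in-memory `OrderStore` keyed by URI path (not the tutorial's actual service): PUT to an unused URI creates the resource, PUT to an existing URI overwrites it, and repeating the same PUT changes nothing further.

```python
from typing import Dict, Tuple

class OrderStore:
    """Hypothetical in-memory resource store keyed by URI path."""

    def __init__(self) -> None:
        self.resources: Dict[str, str] = {}

    def put(self, path: str, representation: str) -> Tuple[int, str]:
        """Create or overwrite the resource at a client-chosen URI."""
        created = path not in self.resources
        self.resources[path] = representation  # absolute change: full overwrite
        return (201, "Created") if created else (200, "OK")

store = OrderStore()
first = store.put("/orders/1234", "<order><status>pending</status></order>")
second = store.put("/orders/1234", "<order><status>preparing</status></order>")
repeat = store.put("/orders/1234", "<order><status>preparing</status></order>")
```

The store is not safe in the HTTP sense, since each call may change state, but the end state after `second` and after `repeat` is identical.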

Page 93: GET /Connected: A Tutorial on Web-based Services

PUT Request

PUT /orders/1234 HTTP/1.1
Host: restbucks.com
Content-Type: application/xml
Content-Length: 386

<order xmlns="http://schemas.restbucks.com/order">
  <location>takeAway</location>
  <items>
    <item>
      <milk>whole</milk>
      <name>latte</name>
      <quantity>2</quantity>
      <size>small</size>
    </item>
    <item>
      <milk>whole</milk>
      <name>cappuccino</name>
      <quantity>1</quantity>
      <size>large</size>
    </item>
  </items>
  <status>preparing</status>
</order>

Updated content

Page 94: GET /Connected: A Tutorial on Web-based Services

PUT Response

HTTP/1.1 200 OK
Date: Sun, 30 Nov 2008 21:47:34 GMT
Content-Length: 0

Minimalist response contains no entity body

Page 95: GET /Connected: A Tutorial on Web-based Services

When PUT goes wrong

•  If we get a 5xx error, or some 4xx errors, simply PUT again!
    –  PUT is idempotent
•  If we get errors indicating incompatible states (409, 417) then do some forward/backward compensating work
    –  And maybe PUT again

HTTP/1.1 409 Conflict
Date: Sun, 21 Dec 2008 16:43:07 GMT
Content-Length: 382

<order xmlns="http://schemas.restbucks.com/order">
  <location>takeAway</location>
  <items>
    <item>
      <milk>whole</milk>
      <name>latte</name>
      <quantity>2</quantity>
      <size>small</size>
    </item>
    <item>
      <milk>whole</milk>
      <name>cappuccino</name>
      <quantity>1</quantity>
      <size>large</size>
    </item>
  </items>
  <status>served</status>
</order>
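The recovery rules above could be encoded on the client roughly as follows. This is a sketch under assumptions: the function name and the action labels are invented for illustration, and treating 408 as one of the retryable 4xx codes is a guess, since the slide does not say which 4xx errors qualify.

```python
def next_step_after_put(status: int) -> str:
    """Classify a PUT response into a hypothetical client-side action."""
    if 200 <= status < 300:
        return "done"
    if status in (409, 417):
        # Incompatible state: compensate first, and maybe PUT again afterwards.
        return "compensate"
    if 500 <= status < 600 or status == 408:
        # PUT is idempotent, so repeating the same request is harmless.
        # (408 Request Timeout as a retryable 4xx is an assumption.)
        return "retry"
    return "give-up"

actions = {code: next_step_after_put(code) for code in (200, 404, 408, 409, 417, 500, 503)}
```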

Page 96: GET /Connected: A Tutorial on Web-based Services

WCF Implementation for PUT

[ServiceContract]
public interface IOrderingService
{
    [OperationContract]
    [WebInvoke(Method = "PUT", UriTemplate = "/orders/{orderId}")]
    void UpdateOrder(string orderId, Order order);

    // …
}

Page 97: GET /Connected: A Tutorial on Web-based Services

WCF Serializable Types

[DataContract(Namespace = "http://schemas.restbucks.com/order", Name = "order")]
public class Order
{
    [DataMember(Name = "location")]
    public Location ConsumeLocation
    {
        get { return location; }
        set { location = value; }
    }

    [DataMember(Name = "items")]
    public List<Item> Items
    {
        get { return items; }
        set { items = value; }
    }

    [DataMember(Name = "status")]
    public Status OrderStatus
    {
        get { return status; }
        set { status = value; }
    }

    // …
}

Page 98: GET /Connected: A Tutorial on Web-based Services

Remove with DELETE

[Diagram: the Ordering Client sends DELETE /orders/1234 to the Ordering Service, which replies with 200 OK, 404 Not Found, 405 Method Not Allowed, or 503 Service Unavailable]

Page 99: GET /Connected: A Tutorial on Web-based Services

DELETE Semantics

•  Stop the resource from being accessible
    –  Logical delete, not necessarily physical

•  Request
DELETE /orders/1234 HTTP/1.1
Host: restbucks.com

•  Response
HTTP/1.1 200 OK
Content-Length: 0
Date: Tue, 16 Dec 2008 17:40:11 GMT

This is important for decoupling implementation details from resources

Page 100: GET /Connected: A Tutorial on Web-based Services

When DELETE goes wrong

•  Simple case, DELETE again!
    –  DELETE is idempotent!
    –  DELETE once, DELETE 10 times has the same effect: one deletion

HTTP/1.1 404 Not Found
Content-Length: 0
Date: Tue, 16 Dec 2008 17:42:12 GMT

Page 101: GET /Connected: A Tutorial on Web-based Services

When DELETE goes Really Wrong

•  Look out for 405 and 409!
•  Some 4xx responses indicate that deletion isn’t possible
    –  The state of the resource isn’t compatible
    –  Try forward/backward compensation instead

HTTP/1.1 409 Conflict
Content-Length: 379
Date: Tue, 16 Dec 2008 17:53:09 GMT

<order xmlns="http://schemas.restbucks.com/order">
  <location>takeAway</location>
  <items>
    <item>
      <name>latte</name>
      <milk>whole</milk>
      <size>small</size>
      <quantity>2</quantity>
    </item>
    <item>
      <name>cappuccino</name>
      <milk>skim</milk>
      <size>large</size>
      <quantity>1</quantity>
    </item>
  </items>
  <status>served</status>
</order>

Can’t delete an order that’s already served
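The DELETE behaviour on the last few slides can be sketched with a hypothetical in-memory `Orders` class. The class, its fields and the choice to model logical deletion with a set are assumptions made for illustration only.

```python
from typing import Dict

class Orders:
    """Hypothetical in-memory orders, keyed by id, each with a status."""

    def __init__(self, statuses: Dict[str, str]) -> None:
        self.statuses = dict(statuses)
        self.deleted = set()  # logical deletion: data is kept, access is stopped

    def delete(self, order_id: str) -> int:
        """Return the HTTP status code a DELETE on this order would produce."""
        if order_id not in self.statuses or order_id in self.deleted:
            return 404
        if self.statuses[order_id] == "served":
            return 409  # resource state is incompatible with deletion
        self.deleted.add(order_id)
        return 200

orders = Orders({"1234": "pending", "5678": "served"})
responses = [orders.delete("1234") for _ in range(3)]
conflict = orders.delete("5678")
```

The status codes differ between the first and later attempts, but the effect on the service is the same: exactly one deletion.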

Page 102: GET /Connected: A Tutorial on Web-based Services

CRUD is Good?

•  CRUD is good –  But it’s not great

•  CRUD-style services use some HTTP features
•  But the application model is limited
    –  Suits database-style applications
    –  Hence frameworks like Microsoft’s Astoria

•  CRUD has limitations –  CRUD ignores hypermedia –  CRUD encourages tight coupling through URI templates –  CRUD encourages server and client to collude

•  The Web supports more sophisticated patterns than CRUD!

Page 103: GET /Connected: A Tutorial on Web-based Services

Tech Interlude Describing CRUD Services

Page 104: GET /Connected: A Tutorial on Web-based Services

WADL Overview

•  Web Application Description Language –  A contract language for Web-based services

•  Think of a WADL contract as a site map for the programmatic Web –  Or like a resource-oriented WSDL for Web-based services

•  Purpose: help tools generate clients which interact with a known set of resources

•  Suits services with a static set of URIs or URI templates

•  Does not suit hypermedia services •  Which we’ll see later!

Page 105: GET /Connected: A Tutorial on Web-based Services

WADL Close Up

<resources base="http://api.search.yahoo.com/NewsSearchService/V1/">
  <resource path="newsSearch">
    <method name="GET" id="search">
      <request>
        <param name="appid" type="xsd:string" style="query" required="true"/>
        <param name="query" type="xsd:string" style="query" required="true"/>
        <param name="type" style="query" default="all">
          <option value="all"/>
          <option value="any"/>
          <option value="phrase"/>
        </param>
        ...
      </request>
      <response>
        <representation mediaType="application/xml" element="yn:ResultSet"/>
        <fault status="400" mediaType="application/xml" element="ya:Error"/>
      </response>
    </method>
  </resource>
</resources>

Callouts, top to bottom: Base URI; Local resource path (can be a template); Verb; Request declaration; Form parameters; Form options; Response declaration; XML response; Results or fault

Page 106: GET /Connected: A Tutorial on Web-based Services

A more Programmatic WADL

•  The previous WADL example showed the Yahoo! news service

•  But it looked like a meta-description of the HTML forms

• WADL seems more natural for the programmatic Web when it uses other formats –  XML, JSON, etc

Page 107: GET /Connected: A Tutorial on Web-based Services

WADL and XML Resources

<resources base="http://example.org/V1/">
  <resource path="{order}">
    <method name="PUT">
      <request>
        <representation mediaType="application/xml" element="r:Order"/>
      </request>
      <response>
        <representation mediaType="application/xml" element="r:Order"/>
        <fault status="409" mediaType="application/xml" element="r:OrderError"/>
      </response>
      ...

Callouts, top to bottom: Path is templated; PUT new order; Order in XML document; Order returned as XML if success; Or 409 with helpful XML error message otherwise

Page 108: GET /Connected: A Tutorial on Web-based Services

WADL Issues

•  Is WADL another WSDL and therefore evil?
    –  It is static, but the Web is dynamic
        •  But do we want the programmatic Web to be dynamic?
        •  Or do we want to have constraints around the resources we interact with?
•  Can WADL avoid being another WSDL?
    –  We could return WADL documents when we move outside the scope of the original document
    –  Seems impractical, tightly coupled
•  Can we use existing Web techniques for specifying contracts instead?

Page 109: GET /Connected: A Tutorial on Web-based Services

Tech Interlude Semantics

Page 110: GET /Connected: A Tutorial on Web-based Services

Microformats

•  Microformats are an example of little “s” semantics

•  Innovation at the edges of the Web –  Not by some central design authority (e.g. W3C)

•  Started by embedding machine-processable elements in Web pages
    –  E.g. Calendar information, contact information, etc
    –  Using existing HTML features like class, rel, etc

Page 111: GET /Connected: A Tutorial on Web-based Services

Semantic versus semantic

•  Semantic Web is top-down
    –  Driven by the W3C with an extensive array of technology, standards, committees, etc
    –  Has not currently proven as scalable as the visionaries hoped
        •  RDF triples have been harvested and processed in private databases
•  Microformats are bottom-up
    –  Little formal organisation, no guarantee of interoperability
    –  Popular formats tend to be adopted (e.g. hCard)
    –  Easy to use and extend for our systems
    –  Trivial to integrate into current and future programmatic Web systems

Page 112: GET /Connected: A Tutorial on Web-based Services

Microformats and Resources

•  Use Microformats to structure resources where formats exist
    –  I.e. Use hCard for contacts, hCalendar for dates

•  Create your own formats (sparingly) in other places –  Annotating links is a good start –  <link rel="withdraw.cash" .../> –  <link rel="service.post" type="application/atom+xml" href="{post-uri}" title="some title">

•  The rel attribute describes the semantics of the referred resource

Page 113: GET /Connected: A Tutorial on Web-based Services

Tech Interlude Hypermedia Formats

Page 114: GET /Connected: A Tutorial on Web-based Services

Media Types Rule!

•  WADL takes an enterprise-y approach to the Web –  Make it static, dumb it down, sell tools to process it

•  This is not how the Web works!
•  The Web’s contracts are expressed in terms of media types
    –  If you know the type, you can process the content
•  Some types are special because they work in harmony with the Web
    –  We call these “hypermedia formats”

Page 115: GET /Connected: A Tutorial on Web-based Services

Other Resource Representations

•  Remember, XML is not the only way a resource can be serialised
    –  Remember the Web is based on REpresentational State Transfer
•  The choice of representation is left to the implementer
    –  Can be a standard registered media type
    –  Or something else
•  But there is a division on the Web between two families
    –  Hypermedia formats
        •  Formats which host URIs and links
    –  Regular formats
        •  Which don’t

Page 116: GET /Connected: A Tutorial on Web-based Services

Plain Old XML is not Hypermedia Friendly

HTTP/1.1 200 OK
Content-Length: 227
Content-Type: application/xml
Date: Wed, 19 Nov 2008 21:48:10 GMT

<order xmlns="http://schemas.restbucks.com/order">
  <location>takeAway</location>
  <items>
    <item>
      <name>latte</name>
      <quantity>1</quantity>
      <milk>whole</milk>
      <size>small</size>
    </item>
  </items>
  <status>pending</status>
</order>

Where are the links? Where’s the protocol?

Page 117: GET /Connected: A Tutorial on Web-based Services

So what?

•  How do you know the next thing to do?
•  How do you know the resources you’re meant to interact with next?
•  In short, how do you know the service’s protocol?
    –  Turn to WADL? Yuck!
    –  Read the documentation? Come on!
    –  URI Templates? Tight Coupling!

Page 118: GET /Connected: A Tutorial on Web-based Services

URI Templates are NOT a Hypermedia Substitute

•  Often URI templates are used to advertise all resources a service hosts
    –  Do we really need to advertise them all?
•  This is verbose
•  This is out-of-band communication
•  This encourages tight-coupling to resources through their URI template
•  This has the opportunity to cause trouble!
    –  Knowledge of “deep” URIs is baked into consuming programs
    –  Service encapsulation is weak and consumers will program to it
    –  Service will change its implementation and break consumers

Page 119: GET /Connected: A Tutorial on Web-based Services

Bad Ideas with URI Templates

•  Imagine we’ve created an order, what next?
•  We could share this URI template:
    –  http://restbucks.com/payment/{order_id}
•  The order_id field should match the order ID that came from the Restbucks service
    –  Sounds great!
•  But what if Restbucks outsources payment?
    –  Change the URI for payments, break the template, break consumers!
•  Be careful what you share!

Page 120: GET /Connected: A Tutorial on Web-based Services

Better Ideas for URI Templates: Entry Points

•  Imagine that we have a well-known entry point to our service –  Which corresponds to a starting point for a protocol

•  Why not advertise that with a URI template? •  For example:

–  http://restbucks.com/signIn/{store_id}/{barista_id}

•  Changes infrequently •  Is important to Restbucks •  Is transparent, and easy to bind to
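Binding to an advertised entry point can be as small as the sketch below. The `expand` helper handles only simple `{name}` variables, a small subset of URI template syntax, and the store and barista identifiers are made-up values for illustration.

```python
import re
from urllib.parse import quote

def expand(template: str, **values: str) -> str:
    """Expand simple {name} variables, percent-encoding each value."""
    def substitute(match):
        return quote(values[match.group(1)], safe="")
    return re.sub(r"\{(\w+)\}", substitute, template)

SIGN_IN = "http://restbucks.com/signIn/{store_id}/{barista_id}"

# Hypothetical identifiers, not taken from the tutorial.
entry_point = expand(SIGN_IN, store_id="london-7", barista_id="jim w")
```

Only the entry point is bound this way; everything after it would be discovered by following links rather than by filling in further templates.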

Page 121: GET /Connected: A Tutorial on Web-based Services

Best Idea for URI Templates: Documentation!

•  Services tend to support lots of resources

•  We need a shorthand for talking about a large number of resources easily

•  We can use a URI template for each “type” of resource that a service (or services) supports

•  But we don’t share this information with others –  Don’t violate encapsulation!

[Diagram: Internal URI Templates and External URI Templates, showing /payment/{order_id}, /{store}/orders and /order/{order_id}]

Page 122: GET /Connected: A Tutorial on Web-based Services

application/xml is not the media type you’re looking for

•  Remember that HTTP is an application protocol –  Headers and representations are intertwined –  Headers set processing context for representations –  Unlike SOAP which can safely ignore HTTP headers

•  It has its own header model

•  Remember that application/xml has a particular processing model –  Which doesn’t include understanding the semantics of links

•  Remember if a representation is declared in the Content-Type header, you must treat it that way –  HTTP is an application protocol – did you forget already?

•  We need real hypermedia formats!

Page 123: GET /Connected: A Tutorial on Web-based Services

Hypermedia Formats

•  Standard
    –  Wide “reach”
    –  Software agents already know how to process them
    –  But sometimes need to be shoe-horned
•  Self-created
    –  Can craft specifically for domain
    –  Semantically rich
    –  But lack reach

Page 124: GET /Connected: A Tutorial on Web-based Services

Two Common Hypermedia Formats: XHTML and Atom

•  Both are commonplace today
•  Both are hypermedia formats
    –  They contain links

•  Both have a processing model that explicitly supports links

• Which means both can describe protocols…

Page 125: GET /Connected: A Tutorial on Web-based Services

XHTML

•  XHTML is just HTML that is also XML
•  For example:

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" xmlns:r="http://restbucks.org">
  <head>
    <title> XHTML Example </title>
  </head>
  <body>
    <p> ...

Callouts: Default XML namespace (xmlns); Other XML namespaces (xmlns:r)

Page 126: GET /Connected: A Tutorial on Web-based Services

What’s the big deal with XHTML?

•  It does two interesting things:
    1.  It gives us document structure
    2.  It gives us links
•  So?
    1.  We can understand the format of those resources
    2.  We can discover other resources!
•  How?
    1.  Follow the links!
    2.  Interact with the resources through the uniform interface
•  Contrast this with the APP approach...similar!

Page 127: GET /Connected: A Tutorial on Web-based Services

XHTML in Action

<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="order">
      <p class="location">takeAway</p>
      <ul class="items">
        <li class="item">
          <p class="name">latte</p>
          <p class="quantity">1</p>
          <p class="milk">whole</p>
          <p class="size">small</p>
        </li>
      </ul>
      <a href="http://restbucks.com/payment/1234" rel="payment">payment</a>
    </div>
  </body>
</html>

Callouts: Business data (the order div); “Hypermedia Control” (the payment link)

Page 128: GET /Connected: A Tutorial on Web-based Services

application/xhtml+xml

•  Can ask which verb the resource at the end of the link supports
    –  Via HTTP OPTIONS
•  No easy way to tell what each link actually does
    –  Does it buy the music?
    –  Does it give you lyrics?
    –  Does it vote for a favourite album?
•  We lack semantic understanding of the linkspace and resources
    –  But we have microformats for that semantic stuff!
•  Importantly XHTML is a hypermedia format
    –  It contains hypermedia controls that can be used to describe protocols!

Page 129: GET /Connected: A Tutorial on Web-based Services

Atom Syndication Format

•  We’ll study this in more depth later, but for now…
•  The application/atom+xml media type is hypermedia aware
•  You should expect links when processing such representations
•  And be prepared to do things with them!

HTTP/1.1 200 OK
Content-Length: 342
Content-Type: application/atom+xml
Date: Sun, 22 Mar 2009 17:04:10 GMT

<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Order 1234</title>
  <link rel="payment" href="http://restbucks.com/payment/1234"/>
  <link rel="special-offer" href="http://restbucks.com/offers/freeCookie"/>
  <id>http://restbucks.com/order/1234</id>
  <updated>2009-03-22T16:57:02Z</updated>
  <summary>1x Cafe Latte</summary>
</entry>

Links to other resources, a nascent protocol
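"Being prepared to do things with links" might look like the following on the client side. This is a sketch using Python's standard `xml.etree.ElementTree`; the `links_by_rel` helper is an illustrative name, and the entry is the one shown on the slide.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Order 1234</title>
  <link rel="payment" href="http://restbucks.com/payment/1234"/>
  <link rel="special-offer" href="http://restbucks.com/offers/freeCookie"/>
  <id>http://restbucks.com/order/1234</id>
  <updated>2009-03-22T16:57:02Z</updated>
  <summary>1x Cafe Latte</summary>
</entry>"""

def links_by_rel(entry_xml: str) -> dict:
    """Map each Atom link's rel value to its href."""
    root = ET.fromstring(entry_xml)
    return {link.get("rel"): link.get("href") for link in root.iter(ATOM + "link")}

links = links_by_rel(ENTRY)
next_uri = links["payment"]  # the client picks its next step by rel, not by URI structure
```

The client depends on the rel value `payment`, so the service is free to change the href without breaking it.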

Page 130: GET /Connected: A Tutorial on Web-based Services

application/atom+xml

•  No easy way to tell what each link actually does
    –  But look at the way the rel attribute is being used
    –  Can we inject semantics there?
•  Atom is a hypermedia format
    –  Both feeds and entries contain hypermedia controls that can describe protocols

Page 131: GET /Connected: A Tutorial on Web-based Services

application/vnd.restbucks+xml

•  What a mouthful!
•  The vnd namespace is for proprietary media types
    –  As opposed to the IANA-registered ones
•  Restbucks XML is a hybrid
    –  We use plain old XML to convey information
    –  And Atom link elements to convey protocol
•  This is important, since it allows us to create RESTful, hypermedia aware services

Page 132: GET /Connected: A Tutorial on Web-based Services

Hypermedia and RESTful Services

Page 133: GET /Connected: A Tutorial on Web-based Services

Revisiting Resource Lifetime

•  On the Web, the lifecycle of a single resource is more than: –  Creation –  Updating –  Reading –  Deleting

•  Can also get metadata
    –  About the resource
    –  About the (subset of) verbs it understands

•  And resources tell us about other resources we might want to interact with… –  A protocol!

Page 134: GET /Connected: A Tutorial on Web-based Services

Links

•  Connectedness is good in Web-based systems

•  Resource representations can contain other URIs

•  Links act as state transitions •  Application (conversation) state is

captured in terms of these states

Page 135: GET /Connected: A Tutorial on Web-based Services

Describing Contracts with Links

•  The value of the Web is its “linked-ness”
    –  Links on a Web page constitute a contract for page traversals
•  The same is true of the programmatic Web
•  Use Links to describe state transitions in programmatic Web services
    –  By navigating resources you change application state
•  Hypermedia formats support this
    –  Allow us to describe higher-order protocols which sit comfortably atop HTTP
    –  Hence application/vnd.restbucks+xml

Page 136: GET /Connected: A Tutorial on Web-based Services

Links are State Transitions

Page 137: GET /Connected: A Tutorial on Web-based Services

Links as APIs

<confirm xmlns="...">
  <link rel="payment" href="https://pay" type="application/xml"/>
  <link rel="postpone" href="https://wishlist" type="application/xml"/>
</confirm>

•  Following a link causes an action to occur
•  This is the start of a state machine!
•  Links lead to other resources which also have links
•  Can make this stronger with semantics
    –  Microformats
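The idea that links define a state machine can be sketched with a plain dictionary. The state names and the transitions out of `wishlisted` are assumptions made for illustration; only the `payment` and `postpone` relations come from the slide.

```python
# Hypothetical representations: each state lists the link relations it offers
# and the state reached by following each one.
REPRESENTATIONS = {
    "confirm": {"payment": "paid", "postpone": "wishlisted"},
    "paid": {},
    "wishlisted": {"payment": "paid"},
}

def follow(state: str, rel: str) -> str:
    """Follow the link with the given rel, if the current state offers it."""
    links = REPRESENTATIONS[state]
    if rel not in links:
        raise ValueError(f"no '{rel}' link in state '{state}'")
    return links[rel]

path = ["confirm"]
for rel in ("postpone", "payment"):
    path.append(follow(path[-1], rel))
```

A transition that is not advertised by the current representation simply is not available, which is how the service constrains what a client may do next.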

Page 138: GET /Connected: A Tutorial on Web-based Services

We have a framework!

•  The Web gives us a processing and metadata model –  Verbs and status codes –  Headers

•  Gives us metadata contracts or Web “APIs” –  URI Templates –  Links

•  Strengthened with semantics –  Little “s”

Page 139: GET /Connected: A Tutorial on Web-based Services

Richardson Model Level 3

•  Lots of URIs that address resources
•  Embraces HTTP as an application protocol
•  Resource representations and formats identify other resources
    –  RESTful at last!

Page 140: GET /Connected: A Tutorial on Web-based Services

Workflow

•  How does a typical enterprise workflow look when it’s implemented in a Web-friendly way?

•  Let’s take Starbucks as an example, the happy path is:
    –  Make selection
        •  Add any specialities
    –  Pay
    –  Wait for a while
    –  Collect drink

Page 141: GET /Connected: A Tutorial on Web-based Services

Workflow and MOM

•  With Web Services we exchange messages with the service
•  Resource state is hidden from view
•  Conversation state is all we know
    –  Advertise it with SSDL, BPEL
•  Uniform interface, roles defined by SOAP
    –  No “operations”

Page 142: GET /Connected: A Tutorial on Web-based Services

Web-friendly Workflow

•  What happens if workflow stages are modelled as resources?
•  And state transitions are modelled as hyperlinks or URI templates?
•  And events modelled by traversing links and changing resource states?
•  Answer: we get Web-friendly workflow
    –  With all the quality of service provided by the Web
•  So let’s see how we order a coffee at Restbucks.com…
    –  This is written up on the Web:
        •  http://www.infoq.com/articles/webber-rest-workflow

Page 143: GET /Connected: A Tutorial on Web-based Services

Placing an Order

•  Place your order by POSTing it to a well-known URI –  http://example.restbucks.com/order

[Diagram: Client and “Starbuck’s Service”]

Page 144: GET /Connected: A Tutorial on Web-based Services

Placing an Order: On the Wire

•  Request
POST /order HTTP/1.1
Host: restbucks.com
Content-Length: ...

<order xmlns="urn:restbucks">
  <drink>latte</drink>
</order>

•  Response
201 Created
Location: http://restbucks.com/order/1234
Content-Type: application/vnd.restbucks+xml
Content-Length: ...

<order xmlns="urn:restbucks">
  <drink>latte</drink>
  <link rel="payment" href="https://restbucks.com/payment/order/1234" type="application/xml"/>
</order>

A link! Is this the start of an API?

If we have a (private) microformat, this can become a neat API!

Page 145: GET /Connected: A Tutorial on Web-based Services

Whoops! A mistake

•  I like my coffee to taste like coffee!
•  I need another shot of espresso
    –  What are my OPTIONS?

•  Request
OPTIONS /order/1234 HTTP/1.1
Host: restbucks.com

•  Response
200 OK
Allow: GET, PUT

Phew! I can update my order, for now
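On the client, the Allow header from an OPTIONS response can be turned into a simple check before attempting an update. A minimal sketch; the helper name `allowed_methods` is an assumption.

```python
def allowed_methods(allow_header: str) -> set:
    """Parse an Allow header value into a set of upper-case method names."""
    return {token.strip().upper() for token in allow_header.split(",") if token.strip()}

can_update_now = "PUT" in allowed_methods("GET, PUT")   # order not yet being prepared
can_update_later = "PUT" in allowed_methods("GET")      # order has moved on
```

The answer is only a snapshot: the resource can change state between the OPTIONS and the PUT, so the PUT must still be ready for a 409.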

Page 146: GET /Connected: A Tutorial on Web-based Services

Optional: Look Before You Leap

•  See if the resource has changed since you submitted your order
    –  If you’re fast your drink hasn’t been prepared yet

•  Request
PUT /order/1234 HTTP/1.1
Host: restbucks.com
Expect: 100-Continue

•  Response
100 Continue

I can still PUT this resource, for now. (417 Expectation Failed otherwise)

Page 147: GET /Connected: A Tutorial on Web-based Services

Amending an Order

•  Add specialities to your order via PUT
    –  Restbucks needs 2 shots!

[Diagram: Client and “Starbuck’s Service”]

Page 148: GET /Connected: A Tutorial on Web-based Services

Amending an Order: On the Wire

•  Request
PUT /order/1234 HTTP/1.1
Host: restbucks.com
Content-Type: application/vnd.restbucks+xml
Content-Length: ...

<order xmlns="urn:restbucks">
  <drink>latte</drink>
  <additions>shot</additions>
  <link rel="payment" href="https://restbucks.com/payment/order/1234" type="application/xml"/>
</order>

•  Response
200 OK
Location: http://restbucks.com/order/1234
Content-Type: application/vnd.restbucks+xml
Content-Length: ...

<order xmlns="urn:restbucks">
  <drink>latte</drink>
  <additions>shot</additions>
  <link rel="payment" href="https://restbucks.com/payment/order/1234" type="application/xml"/>
</order>

Page 149: GET /Connected: A Tutorial on Web-based Services

Side Note: PATCH

•  PUT demands a full representation as the entity body
    –  Onerous if the payload is large
•  PATCH is a verb that supports diffs
    –  The server figures out how to apply those diffs
•  Still a proposal at the time of this tutorial (later standardised as RFC 5789)
    –  Boo!

Page 150: GET /Connected: A Tutorial on Web-based Services

Statelessness

•  Remember interactions with resources are stateless
•  The resource “forgets” about you while you’re not directly interacting with it
•  Which means race conditions are possible
•  Use If-Unmodified-Since on a timestamp to make sure
    –  Or use If-Match and an ETag
•  You’ll get a 412 Precondition Failed if you lost the race
    –  But you’ll avoid potentially putting the resource into some inconsistent state
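The If-Match guard can be sketched with a hypothetical in-memory `Resource`. Deriving the ETag from a hash of the representation is an assumption made for the example; real services may generate entity tags differently.

```python
import hashlib
from typing import Optional, Tuple

class Resource:
    """Hypothetical single resource guarded by an entity tag."""

    def __init__(self, body: str) -> None:
        self.body = body

    @property
    def etag(self) -> str:
        # Assumption: the ETag is a quoted hash of the current representation.
        return '"' + hashlib.sha256(self.body.encode("utf-8")).hexdigest()[:16] + '"'

    def put(self, body: str, if_match: Optional[str]) -> Tuple[int, str]:
        """Apply the update only if the caller's ETag still matches."""
        if if_match is not None and if_match != self.etag:
            return 412, "Precondition Failed"
        self.body = body
        return 200, "OK"

order = Resource("<order><drink>latte</drink></order>")
seen = order.etag  # the client remembers the ETag from its last GET

# Someone else changes the order before the client's PUT arrives.
order.body = "<order><drink>latte</drink><status>preparing</status></order>"

lost_race = order.put("<order><drink>latte</drink><additions>shot</additions></order>", if_match=seen)
won_race = order.put("<order><drink>mocha</drink></order>", if_match=order.etag)
```

The stale PUT is rejected without touching the resource, so the client can GET the new state and decide what to do next.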

Page 151: GET /Connected: A Tutorial on Web-based Services

Warning: Don’t be Slow!

•  Can only make changes until someone actually makes your drink
    –  You’re safe if you use If-Unmodified-Since or If-Match
    –  But resource state can change without you!

•  Request
PUT /order/1234 HTTP/1.1
Host: restbucks.com
...

•  Response
409 Conflict

Too slow! Someone else has changed the state of my order

•  Request
OPTIONS /order/1234 HTTP/1.1
Host: restbucks.com

•  Response
Allow: GET

Page 152: GET /Connected: A Tutorial on Web-based Services

Order Confirmation

•  Check your order status by GETing it

[Diagram: Client and “Starbuck’s Service”]

Page 153: GET /Connected: A Tutorial on Web-based Services

Order Confirmation: On the Wire

•  Request
GET /order/1234 HTTP/1.1
Host: restbucks.com

•  Response
200 OK
Location: http://restbucks.com/order/1234
Content-Type: application/vnd.restbucks+xml
Content-Length: ...

<order xmlns="urn:restbucks">
  <drink>latte</drink>
  <additions>shot</additions>
  <link rel="payment" href="https://restbucks.com/payment/order/1234" type="application/xml"/>
</order>

Are they trying to tell me something with hypermedia?

Page 154: GET /Connected: A Tutorial on Web-based Services

Order Payment

•  PUT your payment to the order resource https://restbucks.com/payment/order/1234

[Diagram: Client and “Starbuck’s Service”; new resource https://restbucks.com/payment/order/1234]

Page 155: GET /Connected: A Tutorial on Web-based Services

How did I know to PUT?

•  The client knew the URI to PUT to from the link
    –  PUT is also idempotent (can safely re-try) in case of failure
•  Verified with OPTIONS
    –  Just in case you were in any doubt

•  Request
OPTIONS /payment/order/1234 HTTP/1.1
Host: restbucks.com

•  Response
Allow: GET, PUT

Page 156: GET /Connected: A Tutorial on Web-based Services

Order Payment: On the Wire

•  Request
PUT /payment/order/1234 HTTP/1.1
Host: restbucks.com
Content-Type: application/xml
Content-Length: ...

<payment xmlns="urn:restbucks">
  <cardNo>123456789</cardNo>
  <expires>07/07</expires>
  <name>John Citizen</name>
  <amount>4.00</amount>
</payment>

•  Response
201 Created
Location: https://restbucks.com/payment/order/1234
Content-Type: application/xml
Content-Length: ...

<payment xmlns="urn:restbucks">
  <cardNo>123456789</cardNo>
  <expires>07/07</expires>
  <name>John Citizen</name>
  <amount>4.00</amount>
</payment>

Page 157: GET /Connected: A Tutorial on Web-based Services

Check that you’ve paid

•  Request
GET /order/1234 HTTP/1.1
Host: restbucks.com

•  Response
200 OK
Content-Type: application/vnd.restbucks+xml
Content-Length: ...

<order xmlns="urn:restbucks">
  <drink>latte</drink>
  <additions>shot</additions>
</order>

My “API” has changed, because I’ve paid enough now

Page 158: GET /Connected: A Tutorial on Web-based Services

What Happened Behind the Scenes?

•  Restbucks can use the same resources!
•  Plus some private resources of their own
    –  Master list of coffees to be prepared
•  Authenticate to provide security on some resources
    –  E.g. only Restbucks is allowed to view payments

Page 159: GET /Connected: A Tutorial on Web-based Services

Payment

•  Only Restbucks systems can access the record of payments
    –  Using the URI template: http://.../payment/order/{order_id}
•  We can use HTTP authorisation to enforce this

•  Request
GET /payment/order/1234 HTTP/1.1
Host: restbucks.com

•  Response
401 Unauthorized
WWW-Authenticate: Digest realm="restbucks.com", qop="auth", nonce="ab656...", opaque="b6a9..."

•  Request
GET /payment/order/1234 HTTP/1.1
Host: restbucks.com
Authorization: Digest username="jw", realm="restbucks.com", nonce="...", uri="/payment/order/1234", qop=auth, nc=00000001, cnonce="...", response="...", opaque="..."

•  Response
200 OK
Content-Type: application/xml
Content-Length: ...

<payment xmlns="urn:restbucks">
  <cardNo>123456789</cardNo>
  <expires>07/07</expires>
  <name>John Citizen</name>
  <amount>4.00</amount>
</payment>
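For readers curious how the `response` field in the Authorization header is derived, here is a sketch of the Digest calculation for qop=auth with MD5, following RFC 2617. The function names are illustrative, and the sample credentials are the ones from the RFC's own example rather than anything belonging to Restbucks.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Hex-encoded MD5 of a string, as used by HTTP Digest."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def digest_response(username, realm, password, method, uri, nonce, nc, cnonce, qop="auth"):
    """Compute the Digest 'response' value for qop=auth (RFC 2617, MD5)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")   # who is asking
    ha2 = md5_hex(f"{method}:{uri}")                  # what is being asked for
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")

# Values from the worked example in RFC 2617, not from Restbucks.
rfc_example = digest_response(
    username="Mufasa", realm="testrealm@host.com", password="Circle Of Life",
    method="GET", uri="/dir/index.html",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093", nc="00000001", cnonce="0a4f113b",
)
```

The password never crosses the wire; the server repeats the same calculation with its stored credentials and compares the results.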

Page 160: GET /Connected: A Tutorial on Web-based Services

Master Coffee List

•  /orders URI for all orders, only accepts GET
    –  Anyone can use it, but it is only useful for Restbucks
    –  It’s not identified in any of our public APIs anywhere, but the back-end systems know the URI

•  Request
GET /orders HTTP/1.1
Host: restbucks.com

•  Response
200 OK
Content-Type: application/atom+xml
Content-Length: ...

<?xml version="1.0" ?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Coffees to make</title>
  <link rel="alternate" href="http://example.restbucks.com/order.atom"/>
  <updated>2007-07-10T09:18:43Z</updated>
  <author><name>Johnny Barrista</name></author>
  <id>urn:starkbucks:45ftis90</id>
  <entry>
    <link rel="alternate" type="application/xml" href="http://restbucks.com/order/1234"/>
    <id>urn:restbucks:a3tfpfz3</id>
  </entry>
  ...
</feed>

Atom feed!

Page 161: GET /Connected: A Tutorial on Web-based Services

Finally drink your coffee...

Source: http://images.businessweek.com/ss/06/07/top_brands/image/restbucks.jpg

Page 162: GET /Connected: A Tutorial on Web-based Services

What did we learn from Restbucks?

•  HTTP has a header/status combination for every occasion
•  APIs are expressed in terms of links, and links are great!
    –  APP-esque APIs
•  APIs can also be constructed with URI templates and inference
    –  But beware tight coupling outside of CRUD services!
•  XML is fine, but we could also use formats like Atom, JSON or even default to XHTML as a sensible middle ground
•  State machines (defined by links) are important
    –  Just as in Web Services…

Page 163: GET /Connected: A Tutorial on Web-based Services

Scalability

Page 164: GET /Connected: A Tutorial on Web-based Services

Statelessness

•  Every action happens in isolation
    –  This is a good thing!
•  In between requests the server knows nothing about you
    –  Excepting any state changes you caused when you last interacted with it.
•  Keeps the interaction protocol simpler
    –  Makes recovery, scalability, failover much simpler too
    –  Avoid cookies!

Page 165: GET /Connected: A Tutorial on Web-based Services

Application vs Resource State

•  Useful services hold persistent data – Resource state –  Resources are buckets of state –  What use is Google without state?

•  Brittle implementations have application state –  They support long-lived conversations –  No failure isolation –  Poor crash recovery –  Hard to scale, hard to do fail-over fault tolerance

•  Recall stateless Web Services – same applies in the Web too!

Page 166: GET /Connected: A Tutorial on Web-based Services

Stateful Example

What if there’s a failure here?

Page 167: GET /Connected: A Tutorial on Web-based Services

Stateful Failure

“Grandmother” Antipattern

Page 168: GET /Connected: A Tutorial on Web-based Services

Stateless System Tolerates Intermittent Failures

Page 169: GET /Connected: A Tutorial on Web-based Services

Scaling Horizontally

•  Web farms have delivered horizontal scaling for years
  –  Though they sometimes do clever things with session affinity to support cookie-based sessions
•  In the programmatic Web, statelessness enables scalability
  –  Just like in the Web Services world

Page 170: GET /Connected: A Tutorial on Web-based Services

Scalable Deployment Configuration

•  Deploy services onto many servers
•  Services are stateless
  –  No cookies!
•  Servers share only back-end data

Page 171: GET /Connected: A Tutorial on Web-based Services

Scaling Vertically…without servers

•  The most expensive round-trip:
  –  From client
  –  Across network
  –  Through servers
  –  Across network again
  –  To database
  –  And all the way back!
•  The Web tries to short-circuit this
  –  By determining early if there is any actual work to do!
  –  And by caching

Page 172: GET /Connected: A Tutorial on Web-based Services

Caching

•  Caching is about scaling vertically –  As opposed to horizontally

• Making a single service run faster –  Rather than getting higher overall throughput

•  In the programmatic Web it’s about reducing load on servers –  And reducing latency for clients

Page 173: GET /Connected: A Tutorial on Web-based Services

Caching in a Scalable Deployment

•  Cache (reverse proxy) in front of server farm
  –  Avoid hitting the server
•  Proxy at client domain
  –  Avoid leaving the LAN
•  Local cache with client
  –  Avoid using the network

Page 174: GET /Connected: A Tutorial on Web-based Services

Being workshy is a good thing!

•  Provide guard clauses in requests so that servers can determine easily if there’s any work to be done
  –  Caches too
•  Use headers:
  –  If-Modified-Since
  –  If-None-Match
  –  And friends
•  Web infrastructure uses these to determine if it’s worth performing the request
  –  And often it isn’t
  –  So an existing representation can be returned

Page 175: GET /Connected: A Tutorial on Web-based Services

Conditional GET Avoids Work!

•  Bandwidth-saving pattern
•  Requires client and server to work together
•  Server sends Last-Modified and/or ETag headers with representations
•  Client sends back those values when it interacts with resource in If-Modified-Since and/or If-None-Match headers
•  Server responds with a 304 Not Modified and an empty body if there have been no updates to that resource state
•  Or gives a new resource representation (with new Last-Modified and/or ETag headers)

ETag is an opaque identifier for a particular resource/version
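The server-side decision can be sketched in a few lines. This is a minimal illustration, not Restbucks code: `Representation` and `conditional_get` are hypothetical names, and the Last-Modified check is simplified to string equality where a real server would compare parsed dates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Representation:
    body: str
    etag: str           # opaque validator for this version of the resource
    last_modified: str  # treated as an opaque string in this sketch


def conditional_get(rep: Representation,
                    if_none_match: Optional[str] = None,
                    if_modified_since: Optional[str] = None):
    """Return (status, headers, body) for a GET against `rep`."""
    headers = {"ETag": rep.etag, "Last-Modified": rep.last_modified}
    if if_none_match is not None:
        # The ETag check takes precedence when both validators are sent.
        unchanged = if_none_match == rep.etag
    elif if_modified_since is not None:
        # Simplification: equality instead of a real date comparison.
        unchanged = if_modified_since == rep.last_modified
    else:
        unchanged = False
    if unchanged:
        return 304, headers, ""   # nothing to send, client copy is current
    return 200, headers, rep.body
```

A client that presents the current ETag gets a 304 and no body; any other request gets the full representation plus fresh validators.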

Page 176: GET /Connected: A Tutorial on Web-based Services

Retrieving a Resource Representation

•  Request
GET /orders/1234 HTTP/1.1
Host: restbucks.com
Accept: application/vnd.restbucks+xml
If-Modified-Since: 2009-01-08T15:00:34Z
If-None-Match: aabd653b-65d0-74da-bc63-4bca-ba3ef3f50432

•  Response
200 OK
Content-Type: application/vnd.restbucks+xml
Content-Length: ...
Last-Modified: 2009-01-08T15:10:32Z
Etag: abbb4828-93ba-567b-6a33-33d374bcad39

<order … />

Page 177: GET /Connected: A Tutorial on Web-based Services

Not Retrieving a Resource Representation

•  Request
GET /orders/1234 HTTP/1.1
Host: restbucks.com
Accept: application/vnd.restbucks+xml
If-Modified-Since: 2009-01-08T15:00:34Z
If-None-Match: aabd653b-65d0-74da-bc63-4bca-ba3ef3f50432

•  Response
HTTP/1.1 304 Not Modified

Client’s representation of the resource is up-to-date

Page 178: GET /Connected: A Tutorial on Web-based Services

Works with other verbs too

PUT /orders/1234 HTTP/1.1
Host: restbucks.com
Accept: application/vnd.restbucks+xml
If-Modified-Since: 2007-07-08T15:00:34Z
If-None-Match: aabd653b-65d0-74da-bc63-4bca-ba3ef3f50432

<order …/>

Page 179: GET /Connected: A Tutorial on Web-based Services

PUT Results in no Work Done

200 OK
Content-Type: application/xml
Content-Length: ...
Last-Modified: 2007-07-08T15:00:34Z
Etag: aabd653b-65d0-74da-bc63-4bca-ba3ef3f50432

Page 180: GET /Connected: A Tutorial on Web-based Services

Atom and AtomPub

Page 181: GET /Connected: A Tutorial on Web-based Services

Syndication History

•  Originally syndication used to provide feeds of information
  –  Same information available on associated Web sites
•  Intended to be part of the "push" Web
  –  And allow syndication etc
•  RSS was the primary driver here
  –  Several versions, loosely described
•  Simple!
•  ATOM followed
  –  Format and protocol
  –  Richer than RSS and now being used for the programmatic Web

Page 182: GET /Connected: A Tutorial on Web-based Services

Syndication

•  Syndication: re-publishing articles collected from a variety of news sources

•  Aggregation, grouping now commonplace on the Web –  Because we’re not tied to print

Page 183: GET /Connected: A Tutorial on Web-based Services

Feed Architecture

[Diagram labels: RSS Client (rich client, web app, etc); Public aggregator; Company feed; News feed; Blog feed; News feed; Bank account. Callout: Uniform interface!]

Page 184: GET /Connected: A Tutorial on Web-based Services

Atom

•  Atom is a format
  –  Aka Atom Syndication Format
  –  XML format for lists of (time-stamped) entries
    •  Aka feeds
  –  Not to be confused with Atom Publishing Protocol

Page 185: GET /Connected: A Tutorial on Web-based Services

Atom Feeds

•  Atom feeds contain useful information aimed at supporting publishing
  –  Its primary domain is weblogs, syndication, etc
•  Better than XHTML in this case?
  –  Because it’s more specific to the problem domain
•  Atom lists are known as feeds
•  Items in Atom lists are known as entries

Page 186: GET /Connected: A Tutorial on Web-based Services

Anatomy of an Atom Feed

•  Media type: application/atom+xml

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>GET Connected</title>
  <link rel="alternate" href="http://restbucks.com"/>
  <updated>2007-07-01T13:00:44Z</updated>
  <author><name>Jim Webber</name></author>
  <contributor><name>Savas Parastatidis</name></contributor>
  <contributor><name>Ian Robinson</name></contributor>
  <id>urn:ab45fe7e-7ff3-886c-11d2-7da3fe465322</id>

Callouts: Feed metadata; HTTP metadata

Page 187: GET /Connected: A Tutorial on Web-based Services

More Anatomy of an Atom Feed

  <entry>
    <title>Chapter 10 complete, says Webber</title>
    <link rel="service.edit" type="application/atom+xml" href="http://restbucks.com/c10.aspx"/>
    <link rel="service.post" type="application/atom+xml" href="http://restbucks.com/c10.aspx"/>
    <id>urn:dd64ef10-975d-23de-13fa-33d32117acb432</id>
    <updated>2007-07-01T13:00:44Z</updated>
    <summary>Chapter 10 deals with the comparison of Web and Web Services approaches to building distributed applications.</summary>
    <category scheme="http://restbucks.com/categories/books" term="local" label="book news"/>
  </entry>
</feed>

Callout (on the service.edit and service.post links): AtomPub API! (more later)
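Reading such a feed programmatically takes little more than a namespace-aware XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; `read_feed` and the trimmed `SAMPLE_FEED` are illustrative names and data, not part of any Atom library.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace in ElementTree notation

# A cut-down version of the feed on the slides (illustrative data only).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>GET Connected</title>
  <updated>2007-07-01T13:00:44Z</updated>
  <entry>
    <title>Chapter 10 complete, says Webber</title>
    <link rel="service.edit" type="application/atom+xml"
          href="http://restbucks.com/c10.aspx"/>
    <id>urn:dd64ef10-975d-23de-13fa-33d32117acb432</id>
    <updated>2007-07-01T13:00:44Z</updated>
  </entry>
</feed>"""


def read_feed(xml_text):
    """Return the feed title and, per entry, its title and rel -> href links."""
    feed = ET.fromstring(xml_text)
    entries = []
    for entry in feed.findall(ATOM + "entry"):
        links = {link.get("rel"): link.get("href")
                 for link in entry.findall(ATOM + "link")}
        entries.append({"title": entry.findtext(ATOM + "title"), "links": links})
    return {"title": feed.findtext(ATOM + "title"), "entries": entries}
```

The `links` dictionary is the interesting part: a client picks its next request by link relation rather than by a hard-coded URI.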

Page 188: GET /Connected: A Tutorial on Web-based Services

Atom Feeds Analogy

[Diagram labels: Resource; Name; Entry name; Creator(s); Content; Location]

Page 189: GET /Connected: A Tutorial on Web-based Services

Atom Feeds and Resources

•  Atom is just a resource representation –  Like XHTML but structured for lists of things

•  An Atom feed is a good resource representation for returning resources in response to a query –  It’s also hypermedia-friendly!

Page 190: GET /Connected: A Tutorial on Web-based Services

Atom Extensibility

•  Q: What if your resource representations don’t fit in Atom entries directly?
•  A: Use your own data!

<entry>
  <title>Chapter 10 complete, says Webber</title>
  ...
  <jw:openIssues xmlns:jw="http://jim.webber.name/bugs">
    <jw:issue title="Colour diagrams degraded in monochrome">
      <jw:status>closed</jw:status>
      <jw:actionTaken date="2007-06-28T16:44:12Z">
        <jw:takenBy>[email protected]</jw:takenBy>
        <jw:description>re-drew all diagrams</jw:description>
      </jw:actionTaken>
    </jw:issue>
    ...
  </jw:openIssues>
</entry>

This will be ignored if your client application doesn’t know the namespace
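The "ignore what you don't understand" rule is easy to sketch: partition an entry's children by namespace and process only the ones the client knows. `split_children` is a hypothetical helper, and the sample entry is a shortened copy of the one above.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Chapter 10 complete, says Webber</title>
  <jw:openIssues xmlns:jw="http://jim.webber.name/bugs">
    <jw:issue title="Colour diagrams degraded in monochrome">
      <jw:status>closed</jw:status>
    </jw:issue>
  </jw:openIssues>
</entry>"""


def split_children(entry_xml, known=(ATOM_NS,)):
    """Partition an entry's child elements into understood and ignored names."""
    understood, ignored = [], []
    for child in ET.fromstring(entry_xml):
        if child.tag.startswith("{"):
            namespace, _, local = child.tag[1:].partition("}")
        else:
            namespace, local = "", child.tag
        (understood if namespace in known else ignored).append(local)
    return understood, ignored
```

A plain Atom client sees only `title`; a client that also registers the `http://jim.webber.name/bugs` namespace picks up `openIssues` as well.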

Page 191: GET /Connected: A Tutorial on Web-based Services

Atom Publishing Protocol

•  APP defines a set of resources that handle publishing Atom documents
  –  Four kinds of resources
    •  Collection
    •  Member
    •  Service Document
    •  Category Document
  –  And their representations on the wire
•  Another uniform interface atop the HTTP uniform interface

Page 192: GET /Connected: A Tutorial on Web-based Services

APP: Collections

•  Collection’s representation is an Atom feed
•  APP defines semantics for the collection representation
  –  GET – retrieve the collection/feed
  –  POST – adds a new member to the collection
    •  Adds a new entry to the feed
  –  PUT and DELETE undefined by APP
    •  But probably should update a collection in place or delete a collection respectively

Page 193: GET /Connected: A Tutorial on Web-based Services

Service Document Example

•  Represents a group of collections
  –  Think: service metadata!
•  Media type: application/atomserv+xml
•  Key features:
  –  Workspaces
    •  Containers for collections
  –  Accept
    •  The media type of the collections
  –  Categories
    •  Metadata describing the kind of the collection
•  Designed to help the discovery process

Page 194: GET /Connected: A Tutorial on Web-based Services

Service Document Example

<service xmlns="http://purl.org/atom/app#" xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Genome Data</atom:title>
    <collection href="http://genomes-r-us.com">
      <atom:title>Human Genome</atom:title>
      <accept>application/atom+xml</accept>
      <categories href="http://genomes-r-us.com/human"/>
    </collection>
  </workspace>
  <workspace>
    <atom:title>Cellular Images</atom:title>
    <collection href="http://cells-r-us.com">
      <atom:title>Human Cells</atom:title>
      <accept>image/*</accept>
      <categories href="http://cells-r-us.com/human"/>
    </collection>
  </workspace>
</service>

Callout (on each accept element): What to POST to a resource

Page 195: GET /Connected: A Tutorial on Web-based Services

Category Document

•  More metadata
  –  Attach a meaningful category scheme to resources
  –  Think: namespaced microformat for Atom entries
•  Media type: application/atomcat+xml

Page 196: GET /Connected: A Tutorial on Web-based Services

Category Document Example

<app:categories xmlns:app="http://purl.org/atom/app#" xmlns="http://www.w3.org/2005/Atom" scheme="http://genomes-r-us/" fixed="yes">
  <category term="human" label="human dna"/>
  <category term="fruitfly" label="insect dna"/>
</app:categories>

Callout (on fixed="yes"): Can only use categories in this scheme

Page 197: GET /Connected: A Tutorial on Web-based Services

Atom and Binary Payloads

•  APP defines a custom HTTP header:
  –  Slug
•  Which describes a binary payload while it’s being uploaded

POST /music/indie HTTP/1.1
Host: jim.webber.name
Content-type: audio/mpeg
Content-length: 6281125
Slug: Hounds of Love by Futureheads

[mp3 content here]

•  And gets a response:
201 Created
Location: http://jim.webber.name/music/indie/hounds-of-love.atom

Page 198: GET /Connected: A Tutorial on Web-based Services

Where’s the file?

•  Look again at the location:
  –  http://jim.webber.name/music/indie/hounds-of-love.atom
•  It refers to an Atom entry not to the uploaded file!

<entry>
  <title>Hounds of Love</title>
  <updated>2007-07-02T11:52:29Z</updated>
  <id>55442bcd-77ae-83ce-234d-0f2754ca754c</id>
  <summary/>
  <link rel="service.edit" type="audio/mpeg" href="http://jim.webber.name/music/indie/hounds-of-love.mp3" />
</entry>

Callouts: Title, derived from the Slug; Server generated; Server generated, based on slug
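How a server turns a Slug into a URI segment is its own business. One plausible rule, shown purely as an assumption, is to percent-decode the header, lowercase it and hyphenate the words. `name_from_slug` is a hypothetical helper; the server on the slides evidently also trims the artist name, which this sketch does not attempt.

```python
import re
from urllib.parse import unquote


def name_from_slug(slug_header: str) -> str:
    """Derive a URI-friendly name from a Slug header value.

    Assumed rule: percent-decode, lowercase, keep runs of letters and
    digits, and join them with hyphens.
    """
    words = re.findall(r"[a-z0-9]+", unquote(slug_header).lower())
    return "-".join(words)
```

The server remains free to ignore the Slug altogether; it is a hint from the client, not a command.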

Page 199: GET /Connected: A Tutorial on Web-based Services

What Can I do with this?

•  Anything you can do with any other Atom entry

•  Add a summary via PUT, or change the title
•  Delete the resource, or anything else that HTTP supports
  –  The binary file will be deleted too!
•  The beauty of Atom is that it splits binaries into binary (which can’t go into a feed) and metadata (which can)

Page 200: GET /Connected: A Tutorial on Web-based Services

AtomPub: A URI-based API

•  Each Atom service supports a number of URIs for manipulating its resources
  –  An API expressed in terms of links!
•  PostURI (returns an Edit URI in the header)
  –  Per service
•  EditURI
  –  Per resource
•  FeedURI
  –  Per query
•  ResourcePostURI
  –  Per service

Callout: Creates entries and sub-entries

Page 201: GET /Connected: A Tutorial on Web-based Services

Post URI

•  Post URI used to create entries
  –  Either full entries (e.g. weblog post),
  –  Or additions (e.g. comments)
•  Client POSTs completed Atom Entry to this URI
•  To create new entry, the link tag is used
  –  Link tag is used in HTML and Atom
  –  In head element (HTML), children of the Feed element (Atom).
•  <link rel="service.post" type="application/atom+xml" href="{post-uri}" title="some title">
•  Note: Multiple link tags may appear together and can be distinguished by having different 'rel', 'type' and 'title' attributes.

Source: ATOM API Quick Reference, Joe Gregorio, http://bitworking.org/news/AtomAPI_Quick_Reference

Look out! The rel attribute value is case sensitive!

Page 202: GET /Connected: A Tutorial on Web-based Services

Edit URI

•  Edit URI used to edit an existing Entry.
•  Client does a GET on the URI to retrieve the Atom Entry,
•  Then modifies the Entry and then PUTs the Entry back to the same URI.
•  Can use DELETE method on the URI to remove the Entry
•  A link tag in either HTML or Atom of the following form will point to the URI for editing an Entry.
<link rel="service.edit" type="application/atom+xml" href="{edit-uri}" title="some title">

Page 203: GET /Connected: A Tutorial on Web-based Services

Feed URI

•  Feed URI used to retrieve Atom feed
•  Feed may contain just Entries for syndication, or may contain additional link tags for browsing/navigating around to other feeds
•  A typical feed for syndication would be indicated by a link tag of the form:
•  <link href="feed-uri" type="application/atom+xml" rel="feed" title="feed title"/>
•  If a feed contains extra link tags for navigation, supplied specifically for the client to use, then they are of the form:
•  <link href="feed-uri" type="application/atom+xml" rel="feed.browse" title="feed title"/>
Or
•  <link href="feed-uri" type="application/atom+xml" rel="prev" title="feed title"/>

Page 204: GET /Connected: A Tutorial on Web-based Services

Consuming Feeds in Applications

•  Feeds on the Internet have so far been used to optimise the human Web
  –  Site summaries, blog posting, etc
•  However feeds are a data structure
  –  And so potentially machine-processable
•  Embedding machine-readable payloads means we have a vehicle for computer-computer interaction

Page 205: GET /Connected: A Tutorial on Web-based Services

Using Atom and AtomPub for Eventing

Atom (XML vocabulary): What does published information look like?
•  Feed: Directory of published resources
•  Entry: Content, or link to content

AtomPub (Protocol): How do you publish information?
•  Discover: Service & category documents
•  Publish: Collections and members; POST, GET, PUT, DELETE; response codes and location headers
•  Conflicts: ETags, conditional GETs

Page 206: GET /Connected: A Tutorial on Web-based Services

On the Wire

Request

GET /products/notifications.atom HTTP/1.1
Host: example.com

Response

HTTP/1.1 200 OK
Cache-Control: max-age=60
Content-Length: 12230
Content-Type: application/atom+xml;charset="utf-8"
Content-Location: http://example.com/products/notifications/2008/9/10/13.atom
Last-Modified: Wed, 10 Sep 2008 13:50:32 GMT
ETag: "6a0806ca"
Date: Wed, 10 Sep 2008 13:51:03 GMT

<feed xmlns="http://www.w3.org/2005/Atom"><title type="text">Product Notifications</title><id>urn:uuid:be21b6b0-57b4-4029-ada4-09585ee74adc</id><updated>2008-09-10T14:50:32+01:00</updated><author><name>Product Management</name><uri>http://example.com/products</uri></author><link rel="self" href="http://example.com/products/notifications/2008/9/10/13.atom"/><link rel="next" href="http://example.com/products/notifications/2008/9/10/12.atom"/><entry><id>urn:uuid:95506d98-aae9-4d34-a8f4-1ff30bece80c</id><title type="text">product created</title><updated>2008-09-10T14:45:32+01:00</updated><link rel="self" href="http://example.com/products/notifications/95506d98-aae9-4d34-a8f4-1ff30bece80c.atom"/><category term="product"/><category term="created"/><content type="application/xml"><ProductCreated xmlns ="http://example.com/products"><Id>52 ...

Highlighted on the slide: the request line, Cache-Control, Content-Location and ETag

Page 207: GET /Connected: A Tutorial on Web-based Services

Atom Feed Represents an Event Stream

<feed xmlns="http://www.w3.org/2005/Atom">
  <title type="text">Product Notifications</title>
  <id>urn:uuid:be21b6b0-57b4-4029-ada4-09585ee74adc</id>
  <updated>2008-09-10T14:50:32+01:00</updated>
  <author>
    <name>Product Management</name>
    <uri>http://example.com/products</uri>
  </author>
  <link rel="self" href="http://example.com/products/notifications/2008/9/10/13.atom"/>
  <link rel="next" href="http://example.com/products/notifications/2008/9/10/12.atom"/>
  <entry>
    <id>urn:uuid:95506d98-aae9-4d34-a8f4-1ff30bece80c</id>
    <title type="text">product created</title>
    <updated>2008-09-10T14:45:32+01:00</updated>
    <link rel="self" href="http://example.com/products/notifications/95506d98-aae9-4d34-a8f4-1ff30bece80c.atom"/>
    <category term="product"/>
    <category term="created"/>
    <content type="application/xml">
      <ProductCreated xmlns="http://example.com/products">
        <Id>527</Id>
        <Href>http://example.com/products/product/527</Href>
        <Version>1</Version>
        <Code>DP</Code>
        <Name>Digital Phone</Name>
        ...

Highlighted on the slide: the feed's self and next links, the entry's self link, its title and its categories

Page 208: GET /Connected: A Tutorial on Web-based Services

Retrieving the Archive by Following Links

Request

GET /products/notifications/2008/9/10/12.atom HTTP/1.1
Host: example.com

Response

HTTP/1.1 200 OK
Cache-Control: max-age=2592000
Content-Length: 9877
Content-Type: application/atom+xml;charset="utf-8"
Last-Modified: Wed, 10 Sep 2008 12:57:14 GMT
Date: Wed, 10 Sep 2008 13:51:46 GMT

<feed xmlns="http://www.w3.org/2005/Atom"><title type="text">Product Notifications</title><id>urn:uuid:4cbc0acf-a211-40ce-a50e-a75d299571da</id><updated>2008-09-10T13:57:14+01:00</updated><author><name>Product Management</name><uri>http://example.com/products</uri></author><link rel="self" href="http://example.com/products/notifications/2008/9/10/12.atom"/><link rel="next" href="http://example.com/products/notifications/2008/9/10/11.atom"/><link rel="previous" href="http://example.com/products/notifications/2008/9/10/13.atom"/><entry><id>urn:uuid:b436fda6-93f5-4c00-98a3-06b62c3d31b8</id><title type="text">hardware deprecated</title><updated>2008-09-10T13:57:14+01:00</updated><link rel="self" href="http://example.com/products/notifications/b436fda6-93f5-4c00-98a3-06b62c3d31b8.atom"/><category term="hardware"/><category term="deprecated"/><content type="application/xml"><HardwareDeprecated xmlns ="http://example.com/products"><Id> ...


Page 209: GET /Connected: A Tutorial on Web-based Services

Archive

<feed xmlns="http://www.w3.org/2005/Atom">
  <title type="text">Product Notifications</title>
  <id>urn:uuid:4cbc0acf-a211-40ce-a50e-a75d299571da</id>
  <updated>2008-09-10T13:57:14+01:00</updated>
  <author>
    <name>Product Management</name>
    <uri>http://example.com/products</uri>
  </author>
  <link rel="self" href="http://example.com/products/notifications/2008/9/10/12.atom"/>
  <link rel="next" href="http://example.com/products/notifications/2008/9/10/11.atom"/>
  <link rel="previous" href="http://example.com/products/notifications/2008/9/10/13.atom"/>
  <entry>
    <id>urn:uuid:b436fda6-93f5-4c00-98a3-06b62c3d31b8</id>
    <title type="text">hardware deprecated</title>
    <updated>2008-09-10T13:57:14+01:00</updated>
    <link rel="self" href="http://example.com/products/notifications/b436fda6-93f5-4c00-98a3-06b62c3d31b8.atom"/>
    <category term="hardware"/>
    <category term="deprecated"/>
    <content type="application/xml">
      <HardwareDeprecated xmlns="http://example.com/products">
        <Id>391</Id>
      </HardwareDeprecated>
    </content>
  </entry>
  ...

Highlighted on the slide: the self, next and previous links

Page 210: GET /Connected: A Tutorial on Web-based Services

Navigating the Archive

[Diagram labels: Latest; Archive; Local Cache; http://example.com/products/notifications.atom; http://example.com/products/notifications/2008/9/10/13.atom; http://example.com/products/notifications/2008/9/10/12.atom; http://example.com/products/notifications/2008/9/10/11.atom; 10 Sept 2008 13:00–14:00; 10 Sept 2008 12:00–13:00; 10 Sept 2008 11:00–12:00; 10 Sept 2008 10:00–11:00]
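A consumer that has been offline can catch up by starting at the latest feed and following links back through the archive until it meets an entry it has already processed. The sketch below assumes, as in the example feeds, that rel="next" points to the older archive. `catch_up` is a hypothetical function, and `fetch` is injected so that a real HTTP client (ideally behind a local cache) or a stub can be supplied.

```python
def catch_up(fetch, start_uri, last_seen_id):
    """Return ids of entries newer than `last_seen_id`, oldest first.

    `fetch(uri)` must return a dict with "entries" (ids, newest first)
    and an optional "next" URI pointing at the older archive feed.
    """
    unseen = []
    uri = start_uri
    while uri is not None:
        feed = fetch(uri)
        for entry_id in feed["entries"]:
            if entry_id == last_seen_id:
                return list(reversed(unseen))
            unseen.append(entry_id)
        uri = feed.get("next")  # walk back in time
    return list(reversed(unseen))


# Stub standing in for the network: three hourly feeds (illustrative data).
FEEDS = {
    "/notifications.atom": {"entries": ["e5", "e4"], "next": "/2008/9/10/12.atom"},
    "/2008/9/10/12.atom": {"entries": ["e3", "e2"], "next": "/2008/9/10/11.atom"},
    "/2008/9/10/11.atom": {"entries": ["e1"], "next": None},
}
```

Because archive feeds never change, every hop after the first can be answered from a cache without touching the origin server.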

Page 211: GET /Connected: A Tutorial on Web-based Services

Handling Eager Re-polling

Request

GET /products/notifications.atom HTTP/1.1
Host: example.com
If-None-Match: "6a0806ca"

Response

HTTP/1.1 304 Not Modified
Date: Wed, 10 Sep 2008 13:57:20 GMT
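On the client side, eager polling stays cheap if the poller remembers the last ETag and replays it in If-None-Match. In this sketch `FeedPoller` is a hypothetical class, `send` is an injected transport, and `fake_send` stands in for the server using the ETag from the example.

```python
class FeedPoller:
    """Polls one feed, replaying the last ETag as If-None-Match."""

    def __init__(self, send):
        self.send = send   # callable(request_headers) -> (status, headers, body)
        self.etag = None
        self.body = None

    def poll(self):
        """Return (body, changed) for one polling round."""
        request_headers = {}
        if self.etag is not None:
            request_headers["If-None-Match"] = self.etag
        status, headers, body = self.send(request_headers)
        if status == 304:
            return self.body, False       # cached copy is still current
        self.etag = headers.get("ETag")   # remember the new validator
        self.body = body
        return body, True


def fake_send(request_headers):
    """Stand-in for the network: one unchanging feed with a fixed ETag."""
    if request_headers.get("If-None-Match") == '"6a0806ca"':
        return 304, {}, ""
    return 200, {"ETag": '"6a0806ca"'}, "<feed/>"
```

The first poll transfers the feed; every later poll costs one small request and an empty 304 until the feed actually changes.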


Page 212: GET /Connected: A Tutorial on Web-based Services

An Alternative Feed Format

<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>urn:uuid:95506d98-aae9-4d34-a8f4-1ff30bece80c</id>
    <title type="text">product created</title>
    <updated>2008-09-10T14:45:32+01:00</updated>
    <link rel="alternate" href="http://example.com/products/notifications/95506d98-aae9-4d34-a8f4-1ff30bece80c.atom"/>
    <category term="product"/>
    <category term="created"/>
  </entry>
  <entry>
    <id>urn:uuid:a1ec0fba-faa7-4d73-b6ce-c69c86c205b6</id>
    <title type="text">product deprecated</title>
    <updated>2008-09-10T14:37:20+01:00</updated>
    <link rel="alternate" href="http://example.com/products/notifications/a1ec0fba-faa7-4d73-b6ce-c69c86c205b6.atom"/>
    <category term="product"/>
    <category term="deprecated"/>
  </entry>
  <entry>
    <id>urn:uuid:f6e14ff6-4007-498e-9d68-076b3b8b0ed2</id>
    <title type="text">hardware updated</title>
    <updated>2008-09-10T14:21:46+01:00</updated>
    <link rel="alternate" href="http://example.com/products/notifications/f6e14ff6-4007-498e-9d68-076b3b8b0ed2.atom"/>
    <category term="hardware"/>
    <category term="updated"/>
  </entry>
  ...

Highlighted on the slide: the alternate link and the category elements

Page 213: GET /Connected: A Tutorial on Web-based Services

Atom Entry Represents an Event

<entry xmlns="http://www.w3.org/2005/Atom">
  <id>urn:uuid:95506d98-aae9-4d34-a8f4-1ff30bece80c</id>
  <title type="text">product created</title>
  <updated>2008-09-10T14:45:32+01:00</updated>
  <link rel="self" href="http://example.com/products/notifications/95506d98-aae9-4d34-a8f4-1ff30bece80c.atom"/>
  <category term="product"/>
  <category term="created"/>
  <content type="application/xml">
    <ProductCreated xmlns="http://example.com/products">
      <Id>527</Id>
      <Link etag="1">http://example.com/products/product/527.xml</Link>
      <Code>DP</Code>
      <Name>Digital Phone</Name>
      <Price>120.00</Price>
      <Features>
        <Feature> <Name>Voice mail</Name> </Feature>
        <Feature> <Name>Call waiting</Name> </Feature>
      </Features>
      <Hardware>
        <Hardware>
          <Id>931</Id>
          <Link etag="6">http://example.com/products/hardware/931.xml</Link>
        </Hardware>
      </Hardware>
    </ProductCreated>
  </content>
</entry>

Highlighted on the slide: the product Link, the nested Hardware element with its Link, and a Feature element

Page 214: GET /Connected: A Tutorial on Web-based Services

Is This the Latest Version of an Entity?

Request

HEAD /products/hardware/931.xml HTTP/1.1
Host: example.com
If-None-Match: "6"

Response

HTTP/1.1 304 Not Modified
Date: Fri, 12 Sep 2008 09:00:34 GMT

Page 215: GET /Connected: A Tutorial on Web-based Services

Connectedness

[Diagram labels: Events (feed, entry); Entities (product, hardware)]

Page 216: GET /Connected: A Tutorial on Web-based Services

Handling Conflicts

[Diagram participants: A, B, C]

Sequence of Events
•  A applies U1
•  B applies U2
•  B publishes U2
•  A publishes U1
•  Feed = U1, U2
•  C applies U2
•  C applies U1

With Versioning
•  A applies U1 (v2)
•  B applies U2 (v3)
•  B publishes U2 (v3)
•  A publishes U1 (v2)
•  Feed = U1 (v2), U2 (v3)
•  C applies U2 (v3)
•  C discards U1 (v2)
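The versioned case comes down to one rule on the consumer: apply an update only if its version is newer than the one already held. A minimal sketch, with `apply_events` as a hypothetical name and events modelled as (name, version) pairs:

```python
def apply_events(events, current_version=1):
    """Apply (name, version) events in the order read; discard stale ones.

    Returns (applied, discarded, final_version).
    """
    applied, discarded = [], []
    for name, version in events:
        if version > current_version:
            current_version = version
            applied.append(name)
        else:
            discarded.append(name)   # older than what we already hold
    return applied, discarded, current_version
```

Fed the events in the order C reads them, U2 (v3) then U1 (v2), the consumer applies U2 and drops U1, so it ends at v3 no matter which publisher reached the feed first.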

Page 217: GET /Connected: A Tutorial on Web-based Services

Caching

Uri | Description | Caching
/products/notifications.atom | Latest | Short
/products/notifications/{year}/{month}/{day}/{hour}.atom | Archive | Long
/products/notifications/{entry-id}.atom | Notification | Long
/products/product/{product-id}.xml | Product | Varies
/products/hardware/{hardware-id}.xml | Hardware | Varies
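One way to turn this policy into headers is a small lookup keyed on the URI shape. The sketch assumes "Short" and "Long" mean the max-age values seen in the earlier example responses (60 and 2592000 seconds); `cache_control` and the regular expressions are illustrative, not taken from the service itself.

```python
import re

SHORT, LONG = 60, 2592000  # seconds, as in the example responses

RULES = [
    (re.compile(r"^/products/notifications\.atom$"), SHORT),                  # latest
    (re.compile(r"^/products/notifications/\d+/\d+/\d+/\d+\.atom$"), LONG),   # archive
    (re.compile(r"^/products/notifications/[0-9a-f-]+\.atom$"), LONG),        # notification
]


def cache_control(path):
    """Return a Cache-Control value for `path`, or None where caching varies."""
    for pattern, max_age in RULES:
        if pattern.match(path):
            return f"max-age={max_age}"
    return None  # products and hardware: decided per resource
```

The latest feed changes often, so it gets a short lifetime; archives and individual notifications never change, so caches may keep them for a long time.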

Page 218: GET /Connected: A Tutorial on Web-based Services

Caching the Bus

[Diagram labels: Service; Caching Proxy; Client; Client; Client]

Page 219: GET /Connected: A Tutorial on Web-based Services

Caching Dilemma

[Diagram labels: Low TTL; High TTL; Publisher controls freshness of data; Efficient use of network resources]

Page 220: GET /Connected: A Tutorial on Web-based Services

Remember Cache Channels?

Mark Nottingham, Yahoo
•  http://www.mnot.net/cache_channels/

Use Atom to extend the freshness of cached responses

Response

Cache-Control: max-age=60, channel="http://example.com/products/channel/indeatom", channel-maxage

Response remains fresh as long as:
•  Cache polls channel at least as often as "precision" specified by channel
•  Channel doesn’t issue stale event

Callout: May have to cache this!

Page 221: GET /Connected: A Tutorial on Web-based Services

Security

Page 222: GET /Connected: A Tutorial on Web-based Services

Good Ole’ HTTP Authentication

•  HTTP Basic and Digest Authentication: IETF RFC 2617
•  Have been around since 1996 (Basic)/1997 (Digest)
•  Pros:
  –  Respects Web architecture:
    •  stateless design (retransmit credentials)
    •  headers and status codes are well understood
  –  Does not prohibit caching (set Cache-Control to public)
•  Cons:
  –  Basic Auth must be used with SSL/TLS (plaintext password)
  –  Not ideal for the human Web – no standard logout
  –  Only one-way authentication (client to server)

Page 223: GET /Connected: A Tutorial on Web-based Services

HTTP Basic Auth Example

1.  Initial HTTP request to protected resource
GET /index.html HTTP/1.1
Host: example.org

2.  Server responds with
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="MyRealm"

3.  Client resubmits request
GET /index.html HTTP/1.1
Host: example.org
Authorization: Basic Qm9iOnBhc3N3b3Jk

Further requests with same or deeper path can include the additional Authorization header preemptively
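The Basic credential is nothing more than `username:password`, Base64-encoded, which is why the scheme needs SSL/TLS underneath it. A sketch with hypothetical helper names, using the user Bob and the password "password" for illustration:

```python
import base64


def basic_authorization(username: str, password: str) -> str:
    """Build the value of an Authorization header for HTTP Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def decode_basic(header_value: str):
    """Recover (username, password) from a Basic Authorization value."""
    token = header_value.split(" ", 1)[1]
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password
```

`decode_basic` shows how little protection the encoding offers: anyone who can read the header can read the password.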

Page 224: GET /Connected: A Tutorial on Web-based Services

HTTP Digest Difference

•  Server reply to first client request:
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Digest
  [email protected],
  qop="auth,auth-int",
  nonce="a97d8b710244df0e8b11d0f600bfb0cdd2",
  opaque="8477c69c403ebaf9f0171e9517f347f2"

•  Client response to authentication challenge:
Authorization: Digest
  username="bob",
  [email protected],
  nonce="a97d8b710244df0e8b11d0f600bfb0cdd2",
  uri="/index.html",
  qop=auth,
  nc=00000001,
  cnonce="0a6f188f",
  response="56bc2ae49393a65897450978507ff442",
  opaque="8477c69c403ebaf9f0171e9517f347f2"
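The `response` field is an MD5 hash chain over the credentials, the server's nonce and the request itself, so the password never crosses the wire. The sketch follows the qop=auth calculation described in RFC 2617; the function names are illustrative.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest `response` value for qop=auth (RFC 2617)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = md5_hex(f"{method}:{uri}")                 # what is being asked
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
```

Changing the nonce, the request count `nc` or the URI changes the result, which is what makes a captured header hard to replay against a different request.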

Page 225: GET /Connected: A Tutorial on Web-based Services

Unhealthy Cookies?

•  Form-based authentication on the human Web uses cookies
•  Can be used on the programmatic Web – POST to the authentication URL
•  Server can (should!) inform client about intended cookie lifetime
•  Cookie value often used as key to server session state
  –  Breaks stateless constraint
  –  Solution that does not require server side session state: http://cookies.lcs.mit.edu/pubs/webauth:tr.pdf

Page 226: GET /Connected: A Tutorial on Web-based Services

SSL / TLS

•  “Strong” server and optional client authentication, confidentiality and integrity protection

•  The only feasible way to secure against man-in-the-middle attacks

•  Not broken! Even if some people like to claim otherwise

•  Not cache friendly, even using ‘null’ encryption mode

•  Performance and security becomes difficult

Page 227: GET /Connected: A Tutorial on Web-based Services

OpenID

•  OpenID is a decentralised framework for digital identities
  –  Not trust, just identity!
•  You have an OpenID provider or one is provided for you
  –  It has a URI
•  Services that you interact with will ask for that URI
•  Your OpenID provider will either:
  –  Accept the request for processing immediately
  –  Ask whether you trust the requesting site (e.g. via email with hyperlinks)
•  Once your OpenID server OK’s the login, then you are authenticated against the remote service
  –  With your canonical credentials

Callouts: Authenticating doesn’t mean you’re authorised to do anything! This is not a trust system!

Page 228: GET /Connected: A Tutorial on Web-based Services

OpenID Workflow

[Diagram labels: MasterCard; Online Merchant; Relying Party; Identity Provider]
1. Send OpenID URL
2. Redirect to Identity Provider
3. Present OpenID credentials (usually username and password)
4. Redirect to Relying Party with security token
5. Present security token

Page 229: GET /Connected: A Tutorial on Web-based Services

Not-So-OpenID

•  There’s no trust between OpenID providers

•  Your Web service might not accept my OpenID provider –  In general it won’t!

•  Trusted providers centralise control –  Against the philosophy of decentralised ID!

•  Federated providers won’t interoperate –  Need a hybrid “signing” model like CAs?

Page 230: GET /Connected: A Tutorial on Web-based Services

OAuth

•  Web-focused access delegation protocol
•  Give other Web-based services access to some of your protected data without disclosing your credentials
•  Simple protocol based on HTTP redirection, cryptographic hashes and digital signatures
•  Extends HTTP Authentication as the spec allows
  –  Makes use of the same headers and status codes
  –  These are understood by browsers and programmatic clients
•  Not dependent on OpenID, but can be used together

Page 231: GET /Connected: A Tutorial on Web-based Services

Why OAuth?

Page 232: GET /Connected: A Tutorial on Web-based Services

OAuth Workflow

[Diagram participants: User; Insurance Broker (Consumer); Insurance Provider (Service Provider)]
1. Request broker to obtain existing insurance policies from insurance provider
2. Request insurance policies
3. Reject with authorisation token
4. Redirect to insurance provider with authorisation token
5. Log in to insurance provider and supply authorisation token
6. Authorise broker access to existing policies

Page 233: GET /Connected: A Tutorial on Web-based Services

OAuth Messages (1)

1.  Alice (the User) has accounts on both the insurance broker and provider’s Web sites
2.  The insurance broker (Consumer) has registered itself at the insurance company and has a Consumer Key and Secret
3.  Alice logs in to the broker and requests it to obtain her existing policies from the provider
4.  Broker request to Insurance Provider:
GET /alice/policies HTTP/1.1
Host: insurance.org
5.  Insurance provider’s response:
401 Unauthorized
WWW-Authenticate: OAuth realm="http://insurance.org/"

Page 234: GET /Connected: A Tutorial on Web-based Services

OAuth Messages (2)

6.  Broker requests authorisation token from Provider:
POST /request_token
oauth_consumer_key=abc&oauth_nonce=39kg&oauth_ ...
7.  Provider sends authorisation token in response body:
200 OK
oauth_token=xyz&oauth_token_secret=abc
8.  Broker redirects Alice to Provider in response to her request:
302 Redirect
Location: http://insurance.org/authorise?oauth_token=xyz&oauth_callback=http%3A%2F%2Fbroker.org&…
9.  Alice logs in to Insurance Provider using her credentials at that site (the Broker never sees these) and authorises the Broker to access her existing policies for a defined period of time.

Page 235: GET /Connected: A Tutorial on Web-based Services

OAuth Messages (3)

10.  Insurance Provider redirects Alice to the callback URL:
     302 Redirect
     Location: http://broker.org/token_ready?oauth_token=xyz

11.  Broker now knows Alice approved, so it asks the Provider for an Access Token:
     GET /accesstoken?oauth_consumer_key=abc&oauth_token=xyz
     Host: insurance.org

12.  The Insurance Provider sends back the Access Token:
     200 Success
     oauth_token=zxcvb

13.  Broker creates hash or signature using access token, nonce, timestamp, Consumer Key and Secret (and more):
     GET /alice/policies HTTP/1.1
     Host: insurance.org
     Authorization: OAuth realm="http://insurance.org/", oauth_signature="…", oauth_consumer_key="abc", …
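Step 13 can be sketched in a few lines. The following is a minimal illustration of an OAuth 1.0 style HMAC-SHA1 signature, not the broker's actual code: the function names, the secrets and the timestamp value are all assumptions made for the example, and only the consumer key, nonce, token and URI come from the messages above.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value: str) -> str:
    """Percent-encode everything except letters, digits and '-._~'."""
    return quote(value, safe="")


def signature_base_string(method: str, url: str, params: dict) -> str:
    # Encode each name and value, sort the pairs, and join them as k=v&k=v.
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalised = "&".join(f"{k}={v}" for k, v in pairs)
    # The base string is METHOD & encoded URL & encoded parameter string.
    return "&".join([method.upper(), pct(url), pct(normalised)])


def sign(method: str, url: str, params: dict,
         consumer_secret: str, token_secret: str) -> str:
    # The HMAC key is the two encoded secrets joined by '&'.
    key = f"{pct(consumer_secret)}&{pct(token_secret)}".encode()
    base = signature_base_string(method, url, params).encode()
    digest = hmac.new(key, base, hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


# Hypothetical values: secrets and timestamp are invented for illustration.
params = {
    "oauth_consumer_key": "abc",
    "oauth_token": "zxcvb",
    "oauth_nonce": "39kg",
    "oauth_timestamp": "1262304000",
    "oauth_signature_method": "HMAC-SHA1",
}
oauth_signature = sign("GET", "http://insurance.org/alice/policies",
                       params, "consumer-secret", "token-secret")
```

The resulting value would travel in the oauth_signature field of the Authorization header, so the provider can recompute it and reject tampered or replayed requests.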

Page 236: GET /Connected: A Tutorial on Web-based Services

Web-Friendly Security

•  Remember that HTTPS isn’t very Web-friendly
   –  You can’t cache it
•  Can instead encrypt cacheable resource representations
   –  E.g. Atom entry elements
•  Safe and cacheable!
•  Allows scalable, secure, pub/sub and more!

Page 237: GET /Connected: A Tutorial on Web-based Services

WS-* Wars

Page 238: GET /Connected: A Tutorial on Web-based Services

What about WS-* ?

•  SOAP was a disruptive technology in 2000
   –  Made heterogeneous integration possible
   –  And commoditised the whole market
•  But since then SOAP and WS-* have taken quite a bashing!

Page 239: GET /Connected: A Tutorial on Web-based Services

Richardson Model Level 0

•  Single well-known endpoint
   –  Not really URI friendly
•  Doesn’t understand HTTP
   –  Other than as a transport
•  No hypermedia
   –  But has some features for describing protocols: BPEL, WS-Chor, SSDL

Page 240: GET /Connected: A Tutorial on Web-based Services

The Web Services Stack

[Diagram: the Web Services stack. Boxes shown: Network Layer, SOAP, WS-Addressing, Routing, WS-Security, WS-Trust, WS-Federation, WS-SecureConversation, WS-Coordination, WS-Transaction, WSDL, WS-Policy, WS-MEX, BPEL, WS-Choreography, Process Management (Workflow), Service, Application-Specific Code]

Page 241: GET /Connected: A Tutorial on Web-based Services

SOAP

•  SOAP has:
   –  Well-defined envelope
   –  Well-defined processing model
      •  Encompasses sender, receiver, and intermediaries
•  It is a transfer protocol
   –  And treats everything underneath as a transport
   –  Including HTTP!

Page 242: GET /Connected: A Tutorial on Web-based Services

Envelopes

POST /orders HTTP/1.1
Host: restbucks.com
Content-Type: application/xml
Content-Length: 3206

<Order xmlns="http://..." …/>

<soap:Envelope xmlns:soap="http://...">
  <soap:Header>
    <wsa:To xmlns:wsa="http://...">http://restbucks.com/order</wsa:To>
  </soap:Header>
  <soap:Body>
    <Order xmlns="http://..." .../>
  </soap:Body>
</soap:Envelope>

Page 243: GET /Connected: A Tutorial on Web-based Services

WS-Security

•  Handles the cryptographic aspects of a message exchange
   –  In accordance with the service metadata
      •  SecurityPolicy (see later)
•  Provides an overarching framework that allows various secure hashing, signature, and shared-key/asymmetric encryption algorithms to be used

Source: Dr. Halvard Skogsrud

Page 244: GET /Connected: A Tutorial on Web-based Services

Dependencies

•  Uses a number of allied specifications (known as profiles) for handling different credential types
   –  Kerberos tickets
   –  X.509 certificates
   –  SAML assertions
•  Profiles are maintained by the same standards body committee as the main WS-Security specification (OASIS)

Page 245: GET /Connected: A Tutorial on Web-based Services

What’s new in Web service security?

•  Since the underlying transport can vary, even along a single message path, the security solution should be independent of the transport mechanism
•  In other words, we can only assume that SOAP is used
•  Also, complex interactions (beyond request/response) are possible, and this requires a richer framework

Page 246: GET /Connected: A Tutorial on Web-based Services

Specification Stack

• WS-* security related standards and specifications stack:

[Diagram: stack of SOAP, WS-Security, WS-Policy, WS-PolicyAttachment, WS-PolicyAssertions, WS-SecurityPolicy, WS-Trust, WS-SecureConversation and WS-Federation]

Page 247: GET /Connected: A Tutorial on Web-based Services

Secure Messaging Architecture

•  (Optional) key exchange phase
   –  May be out of band or part of a higher-order protocol like WS-SecureConversation
   –  Key exchange usually uses asymmetric encryption to transfer a shared key, which is subsequently used to encrypt message transfers
   –  Other approaches are supported, including purely asymmetric keys
•  SOAP messages are signed and/or encrypted as they leave the sending Web Service. When a message is received, its signature is verified and/or its payload is decrypted

Page 248: GET /Connected: A Tutorial on Web-based Services

Web Security

•  Recall:
   –  HTTPS
      •  Bilateral certificate exchange is quite secure!
   –  OpenID
   –  OAuth
•  SAML has Web bindings too
•  Advantages:
   –  Well known and understood
   –  Fewer places for security problems to hide!

Page 249: GET /Connected: A Tutorial on Web-based Services

Transactions

•  Transactions ensure consistency across Web services involved in an application
•  There have been three major competing standards!
   –  OASIS BTP
   –  MS/IBM/BEA WS-Transaction
   –  Arjuna/Sun/Oracle/etc WS-CAF
•  Now a single, unified set of standards
   –  WS-Coordination
   –  WS-AtomicTransaction
   –  WS-BusinessActivity

Page 250: GET /Connected: A Tutorial on Web-based Services

WS-C/AT/BA Layered Model

•  WS-Coordination provides the basic context and context lifecycle features
   –  Context provides a shared-session kind of abstraction across Web Services
•  WS-AtomicTransaction layers on a strict 2PC protocol
   –  Just for interop between TP monitors like CICS and Tuxedo
•  WS-BusinessActivity layers on a relaxed-isolation 2PC protocol
   –  Compensating model for business transactions

[Diagram: WS-AtomicTransaction and WS-BusinessActivity layered on top of WS-Coordination]

Page 251: GET /Connected: A Tutorial on Web-based Services

Modelling Classic Transactions on the Web

Page 252: GET /Connected: A Tutorial on Web-based Services

No Need for Transactions at Web Scale

•  Transactions are the only way of getting end-to-end reliability in a distributed system
   –  But they require a trusted coordinator!
•  Transactions reduce scalability because they lock/reserve resources
   –  Not very Web-friendly!
•  The Web has coordination baked in!
   –  Use status codes after each action instead of classic transactional coordination
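As a rough sketch of what "status codes after each action" could look like in client code, the function below runs a sequence of HTTP interactions, checks the status code after each one, and undoes earlier work if a later step is rejected. The pairing of each action with an undo action is an assumption made for this illustration (it mirrors the compensating style mentioned for WS-BusinessActivity), and the names are hypothetical.

```python
from typing import Callable, List, Tuple

# Each step is (action, compensation); both return an HTTP status code.
Step = Tuple[Callable[[], int], Callable[[], int]]


def run_steps(steps: List[Step]) -> bool:
    """Run actions in order, checking the status code after each one."""
    completed: List[Callable[[], int]] = []
    for action, compensate in steps:
        status = action()
        if 200 <= status < 300:
            # Success: remember how to undo this step, then carry on.
            completed.append(compensate)
            continue
        # Failure: undo what already succeeded, most recent first.
        for undo in reversed(completed):
            undo()
        return False
    return True
```

No coordinator holds locks here. Each service simply reports the outcome of its own action, and the client decides what to do next.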

Page 253: GET /Connected: A Tutorial on Web-based Services

Reliable Messaging

•  WS-ReliableMessaging offers four schemes for reducing message delivery errors between Web Services:
   –  At most once – Duplicate messages will not be delivered, but messages may still be dropped;
   –  At least once – Every message will be delivered, but duplication may occur;
   –  Exactly once – Every message will be delivered once and once only;
   –  In order – Messages will be received in the same order they were sent.
•  These are good principles that we’d like to uphold in most distributed systems
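To make the "exactly once" and "in order" guarantees concrete, here is a minimal sketch of a receiver that numbers messages within a sequence, drops duplicates and holds back anything that arrives early. It illustrates the idea only and is not an implementation of the WS-ReliableMessaging protocol; the class and method names are hypothetical.

```python
class InOrderReceiver:
    """Delivers each numbered message once, in ascending order."""

    def __init__(self) -> None:
        self.expected = 1      # next message number to deliver
        self.pending = {}      # early arrivals, keyed by message number
        self.delivered = []    # what the application has actually seen

    def receive(self, number: int, payload: str) -> None:
        # Already delivered, or already buffered: treat as a duplicate.
        if number < self.expected or number in self.pending:
            return
        self.pending[number] = payload
        # Deliver as long as the next expected message is available.
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1
```

A sender that retransmits until acknowledged gives at-least-once delivery. Pairing it with a receiver like this one turns that into exactly-once, in-order delivery to the application.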

Page 254: GET /Connected: A Tutorial on Web-based Services

A Reliable Session

Page 255: GET /Connected: A Tutorial on Web-based Services

Reliable Interactions on the Web

•  The Web deals with many of the same requirements as WS-ReliableMessaging
   –  Does so using HTTP verbs, headers, and status codes to coordinate interactions and implement retries
•  At most once
   –  PUT/GET/DELETE until success
•  At least once
   –  POST/PUT/GET/DELETE until success
•  Exactly once
   –  PUT/GET/DELETE until success
•  In order
   –  Implicit: the Web is synchronous!
•  POST can be much more troublesome
   –  Use POST Once Exactly, a header-based token scheme
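The "until success" pattern can be sketched as a small retry loop. The version below is an assumption-laden illustration rather than a recommended client: `send` is a hypothetical callable that performs one HTTP request and returns its status code, 5xx responses and connection errors are treated as "try again", and POST is refused because repeating it blindly is not safe.

```python
import time
from typing import Callable

# Verbs that can be repeated without changing the outcome.
IDEMPOTENT = {"GET", "PUT", "DELETE"}


def send_until_success(method: str, send: Callable[[], int],
                       attempts: int = 5, delay: float = 0.0) -> int:
    """Retry an idempotent request until the server gives a usable answer."""
    if method.upper() not in IDEMPOTENT:
        raise ValueError("only idempotent verbs are safe to retry blindly")
    for attempt in range(attempts):
        try:
            status = send()
        except ConnectionError:
            status = None  # no answer at all, so try again
        if status is not None and status < 500:
            # Any non-5xx answer is handed back for the caller to interpret.
            return status
        time.sleep(delay * (2 ** attempt))  # simple exponential back-off
    raise RuntimeError(f"gave up after {attempts} attempts")
```

Because PUT, GET and DELETE are idempotent, repeating them has the same effect as performing them once, which is why retries alone are enough for those verbs.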

Page 256: GET /Connected: A Tutorial on Web-based Services

Epilogue

Page 257: GET /Connected: A Tutorial on Web-based Services

Web Architecture

•  Ubiquitous, global on-ramp
•  Connects everything to everything, based on URI-addressable resources
   –  With a uniform interface
•  Also provides standard coordination mechanism
   –  Status codes!
•  And is ambivalent about content
   –  Media types!

Page 258: GET /Connected: A Tutorial on Web-based Services

URI Tunnelling

•  Map URIs to methods and GET those URIs
   –  Easy, ubiquitous
•  Not very Web-friendly
   –  Breaks expectations
   –  Remember the Library of Congress incident?
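For readers who want to see the shape of URI tunnelling, here is a minimal sketch: the path names a method and the query string carries its arguments. The `/PlaceOrder` path, the `place_order` function and its parameters are hypothetical names chosen for this example.

```python
from urllib.parse import parse_qsl, urlsplit


def place_order(coffee: str, size: str) -> str:
    # Stand-in for a server-side method with side effects.
    return f"ordered {size} {coffee}"


# The path of the URI selects the method to invoke.
METHODS = {"/PlaceOrder": place_order}


def handle_get(uri: str) -> str:
    parts = urlsplit(uri)
    method = METHODS[parts.path]
    # Query parameters become the method's keyword arguments.
    args = dict(parse_qsl(parts.query))
    return method(**args)


result = handle_get("http://restbucks.com/PlaceOrder?coffee=latte&size=large")
```

This is exactly why the style breaks expectations: a GET is supposed to be safe, yet here any crawler or cache that follows the URI would place an order.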

Page 259: GET /Connected: A Tutorial on Web-based Services

POX

•  Treats HTTP as a synchronous transport protocol
   –  Great because it gets through firewalls
•  But again breaks expectations
   –  HTTP is not MOM!
•  Misses out on all the good stuff from the Web
   –  Status codes for coordination
   –  Caching for performance
   –  Loose coupling via hypermedia
   –  Etc
•  Not as good as proper message-oriented middleware
   –  Which is low-latency, reliable, etc.

Page 260: GET /Connected: A Tutorial on Web-based Services

CRUD Services

•  The simplest kind of Web-based service
•  Embraces HTTP and Web infrastructure
   –  Four verbs, status codes, formats
   –  Cacheable!
•  Can easily describe them
   –  URI templates
   –  WADL
•  But tightly couples client and server
   –  Might not be a problem in some domains
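A CRUD service boils down to mapping the four verbs onto create, read, update and delete, and answering with the matching status code. The in-memory sketch below assumes a hypothetical `/orders` collection and an `OrderStore` class; it leaves out the HTTP server itself so that the mapping stays visible.

```python
class OrderStore:
    """Maps POST/GET/PUT/DELETE onto an in-memory dictionary of orders."""

    def __init__(self) -> None:
        self._orders = {}
        self._next_id = 1

    def handle(self, verb: str, path: str, body=None):
        """Return a (status code, payload) pair for one request."""
        if verb == "POST" and path == "/orders":
            uri = f"/orders/{self._next_id}"
            self._next_id += 1
            self._orders[uri] = body
            return 201, uri              # Created; payload is the new URI
        if verb in ("GET", "PUT", "DELETE"):
            if path not in self._orders:
                return 404, None         # Not Found
            if verb == "GET":
                return 200, self._orders[path]
            if verb == "PUT":
                self._orders[path] = body
                return 200, body
            del self._orders[path]
            return 204, None             # No Content
        return 405, None                 # Method Not Allowed
```

The tight coupling mentioned above shows up straight away: a client of this service has to know the `/orders/{id}` URI structure in advance.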

Page 261: GET /Connected: A Tutorial on Web-based Services

Hypermedia

•  It’s all about links!
   –  Describe state machines with lots of lovely links
•  Constrain what you can do to resources with the uniform interface
•  And describe formats with media types
   –  Hypermedia formats, because of links!
•  Loosely coupled
   –  The server mints URIs to resources, clients follow them
   –  Easily spans systems/domains (URIs are great!)
•  Embraces the Web for robustness
   –  Verbs, status codes, caching
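The loose coupling comes from clients choosing links by their relation name instead of constructing URIs. The sketch below assumes a toy set of representations held in a dictionary, with hypothetical URIs and link relations (`payment`, `receipt`); a real client would fetch each representation with GET.

```python
# Hypothetical representations, each carrying links keyed by relation name.
RESOURCES = {
    "/orders/1": {"status": "unpaid", "links": {"payment": "/payments/1"}},
    "/payments/1": {"status": "paid", "links": {"receipt": "/receipts/1"}},
    "/receipts/1": {"status": "complete", "links": {}},
}


def follow(start: str, rels, get=RESOURCES.__getitem__) -> str:
    """Walk from a starting URI by following the named link relations."""
    uri = start
    for rel in rels:
        representation = get(uri)
        # The client only knows the relation; the server minted the URI.
        uri = representation["links"][rel]
    return uri


final_uri = follow("/orders/1", ["payment", "receipt"])
```

If the server later moves payments to another host, only the link values change and a client written this way keeps working.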

Page 262: GET /Connected: A Tutorial on Web-based Services

Scalability

•  Everything you know still applies
   –  Stateless is good
   –  Horizontal is good
•  Yet everything you know no longer applies!
   –  Text-based synchronous protocol is scalable???
•  Do as little work as possible
   –  Make interactions conditional
      •  ETags, If-None-Match, If-Modified-Since etc. are your friends
•  And cache!
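A conditional GET is the simplest way to "do as little work as possible". In the sketch below the server derives an ETag from the representation and answers 304 Not Modified when the client's If-None-Match value still matches. Hashing the body with SHA-1 is only one way to mint an ETag, and the function names are hypothetical.

```python
import hashlib


def etag_for(body: bytes) -> str:
    # A strong ETag derived from the bytes of the representation.
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def conditional_get(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET on this representation."""
    tag = etag_for(body)
    if if_none_match == tag:
        # The client's copy is still fresh: send no body at all.
        return 304, {"ETag": tag}, b""
    return 200, {"ETag": tag}, body


# First request transfers the body; the revalidation does not.
status, headers, payload = conditional_get(b"<order>latte</order>")
status_again, _, payload_again = conditional_get(
    b"<order>latte</order>", if_none_match=headers["ETag"])
```

The second exchange still costs a round trip, but it saves the bandwidth and any work the client would do to re-parse an unchanged representation.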

Page 263: GET /Connected: A Tutorial on Web-based Services

Atom and AtomPub

•  Atom is a format that describes lists of things
   –  In terms of feeds and entries
•  AtomPub is a protocol defined in terms of Atom entries and links
•  Together they can be used for very scalable pub/sub
   –  But latency is very high compared to enterprise pub/sub
   –  Caching improves scalability
   –  But increases latency

Page 264: GET /Connected: A Tutorial on Web-based Services

Security

•  HTTPS is still our friend!
   –  But it inhibits caching
•  OpenID support waning on the human Web
•  OAuth still finding its feet
•  Other approaches like SAML are mature but yet to be widely deployed
   –  Will the programmatic Web drive this?

Page 265: GET /Connected: A Tutorial on Web-based Services

Questions? (Our book has all the answers, coming in early 2010!)