The New Guardian

Stephen Dunn, head of technology strategy

Mat Wall & Phil Wills, technical architects

Rebuilding guardian.co.uk

description

Stephen Dunn, Mat Wall and Phil Wills talk about The Guardian platform at OpenTech 2008: how we are using open source technologies; the open source tools released by developers who worked on the recent rebuild; and how our use of REST is creating hackable feeds and allowing services and data from the web to be built into our platform.

Transcript of The New Guardian

Page 1: The New Guardian

Stephen Dunn, head of technology strategy

Mat Wall & Phil Wills, technical architects

Rebuilding guardian.co.uk

Page 2: The New Guardian

Online since 1996

1.5M pages and counting

181M pages/month (Jan 08)

19.7M visitors (Jan 08)

3x Webby award winner

guardian.co.uk 1999 - 2007

Page 6: The New Guardian

Relaunch, 2007-2008

Page 11: The New Guardian

Could not redesign

Thousands of templates with content and presentation mixed

Editors manually created most navigation

Platform unable to:

create open services around content

connect to other services on the web

No strong information model beyond “stuff on a page”

What was wrong

Page 12: The New Guardian

Resources should be:

permanent

addressable

discoverable

New web, new rules

Page 13: The New Guardian

“A cool URI is one that does not change” Tim Berners-Lee 1998

1.5 million resources redirected to new URL scheme

Permanent

http://www.flickr.com/photos/fstorr/
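
Redirecting old addresses to a new URL scheme can be sketched as a lookup that answers with a permanent (301) redirect. This is a minimal illustration only: the talk does not say how the redirects were implemented, and the paths and the `resolve` function below are hypothetical.

```python
from http import HTTPStatus

# Hypothetical old-to-new mapping; the real entries are not given in the talk.
LEGACY_TO_NEW = {
    "/old/story/0,,12345,00.html": "/world/2008/jan/01/iraq",
}


def resolve(path):
    """Return (status, location) for a requested path."""
    new_path = LEGACY_TO_NEW.get(path)
    if new_path is not None:
        # 301 tells browsers and crawlers that the move is permanent.
        return HTTPStatus.MOVED_PERMANENTLY, new_path
    # Unknown paths are assumed to already follow the new scheme.
    return HTTPStatus.OK, path
```

Old links keep working because every legacy address still resolves to exactly one resource.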

Page 14: The New Guardian

Addressable

Resources are “about” something and there is a resource for everything meaningful - ready for the social web, e.g. “share this” shares something meaningful

The “age of point-at-things” (Coates 2005)

Page 15: The New Guardian

Usability testing showed

When browsing the web, most people don’t really know which site they are on, let alone which section of the site.

Discoverable

Page 16: The New Guardian

Multiple routes to content

Most content is multifaceted

Wherever there is more than one piece of content gathered - offer it as a feed

Discoverable
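
The rule "wherever there is more than one piece of content gathered, offer it as a feed" could look roughly like this. It is a sketch using Python's standard library to emit a minimal RSS 2.0 document; the function name `to_rss` and the example URLs are assumptions, not part of the Guardian platform.

```python
import xml.etree.ElementTree as ET


def to_rss(title, link, items):
    """Render a gathered list of (title, url) pairs as a minimal RSS 2.0 feed."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Latest content for " + title
    for item_title, item_url in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_url
    return ET.tostring(rss, encoding="unicode")


# Hypothetical usage: the same list that drives a page also drives its feed.
feed = to_rss(
    "Iraq",
    "http://example.com/world/iraq",
    [("First story", "http://example.com/a"), ("Second story", "http://example.com/b")],
)
```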

Page 18: The New Guardian

[Diagram: site hierarchy, with “iraq” called out]

guardian

sport: tennis, cricket, rugby

world news: usa, iraq

media: tv, radio, press

Page 20: The New Guardian

Automatically generate pages and navigation

Still under editorial control

Routes to content

Page 26: The New Guardian

Pages are responsible for aggregating the core resource and other data for our readers

Pages are associated with templates which provide layout for the data

Pages can be rendered in more than one way

Our architecture: pages

Page 27: The New Guardian

The addressable resource is the content

We only allow one core resource per URL

We can have multiple types of content

We also need to create pages that aggregate many resources

[Diagram: a Page has a url and a template; Page and Content are linked 1 to 1; Article and Video are types of Content]

Our architecture: pages
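
The page model described on these two slides might be sketched as follows. The class and field names (`Site`, `headline`, `publish`, `render`) are illustrative assumptions; only the ideas of one core resource per URL, several content types, and more than one rendering per page come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Content:
    headline: str


class Article(Content):
    """One type of content."""


class Video(Content):
    """Another type of content."""


@dataclass
class Page:
    url: str
    template: str
    content: Content  # exactly one core resource per page


class Site:
    def __init__(self):
        self._pages = {}

    def publish(self, page):
        # Enforce "one core resource per URL".
        if page.url in self._pages:
            raise ValueError("URL already has a core resource: " + page.url)
        self._pages[page.url] = page

    def render(self, url, template=None):
        # A page can be rendered in more than one way by swapping the template.
        page = self._pages[url]
        return "[%s] %s" % (template or page.template, page.content.headline)
```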

Page 28: The New Guardian

The world is not a hierarchy

Tags categorise content

Tags provide many routes to content

Tags drive syndication

Our architecture: tags

Page 29: The New Guardian

Our architecture: tags

We can have many types of tag

Tag pages can automatically aggregate related content

Folders provide alternate taxonomy for groups of tags

Associations freely managed by editors

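[Diagram: entity model. Page (url) and Content, 1 to 1; Content and Tag, many to many; Tag and Page, 1 to 1; Keyword, Author etc. as types of Tag; Folder and Tag, many to many]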
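
A rough sketch of how typed tags, tag pages and folders could fit together is given below. The `TagIndex` class, its method names and the example tags are hypothetical; the slides only state that tags have types, that tag pages aggregate related content, and that folders group tags.

```python
from collections import defaultdict


class TagIndex:
    """Many-to-many association between content URLs and typed tags."""

    def __init__(self):
        self._content_by_tag = defaultdict(list)
        self._tags_by_folder = defaultdict(list)

    def tag(self, content_url, *tags):
        # A tag is a (type, name) pair, e.g. ("keyword", "iraq").
        for t in tags:
            if content_url not in self._content_by_tag[t]:
                self._content_by_tag[t].append(content_url)

    def file_under(self, folder, *tags):
        # Folders are an alternate taxonomy over groups of tags.
        self._tags_by_folder[folder].extend(tags)

    def tag_page(self, tag):
        # A tag page aggregates everything carrying that tag.
        return list(self._content_by_tag.get(tag, []))

    def folder_page(self, folder):
        seen = []
        for t in self._tags_by_folder.get(folder, []):
            for url in self.tag_page(t):
                if url not in seen:
                    seen.append(url)
        return seen
```

Because an item can carry any number of tags, the same piece of content appears on several tag pages without being filed in a single place in a hierarchy.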

Page 30: The New Guardian

Tags drive advertising and 3rd party integration

Our architecture: tags

Page 33: The New Guardian

External information

Page 34: The New Guardian

Traditional feed: use database as integration point

Model and store external entities in internal database

External information

Requires knowledge of external business logic

Complex and expensive to build

Impedance mismatch: does not make use of the resource oriented nature of the web

[Diagram: External system, net, Feed, Database, App server, Web server]

Page 37: The New Guardian

Delegate as much understanding & management of data as possible

Model should be resource oriented. External information should be “attached” rather than persisted.

Use simplest possible caching for external information

Business relationship should be partnership, not supplier / consumer

Using the web

Page 38: The New Guardian

[Diagram: Database, App server, Web server, External system, net, Proxy]

Using the web

The simple change of not using the database as the integration point reduces the cost of integration

No longer need to fully understand external business logic

No longer need to clear cache

Information more up to date
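
The "simplest possible caching" in front of an external system could be as small as a time-to-live cache inside the proxy. This is a sketch under assumptions: `CachingProxy`, the `fetch` callable and the 60 second default are invented for illustration and are not described in the talk.

```python
import time


class CachingProxy:
    """Fetch external resources on demand and keep them only briefly."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch      # callable taking a URL and returning a body
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the behaviour can be tested
        self._cache = {}         # url -> (time fetched, body)

    def get(self, url):
        now = self._clock()
        hit = self._cache.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        # Entries simply expire, so nothing has to clear the cache explicitly.
        body = self._fetch(url)
        self._cache[url] = (now, body)
        return body
```

Nothing from the external system is written to the database; a stale entry is replaced the next time it is requested after its time-to-live has passed.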

Page 39: The New Guardian

How we see football

Match

Home Team Away Team

Tournament

Page 40: The New Guardian

How we see football

Match

Home Team Away Team

Tournament

Tag Content

Page 41: The New Guardian

Resource oriented model

Match

Home Team Away Team

Tournament

/football/match/2008/12345

/football/tournament/15

/football/team/500

/football/team/31

App server

External system

net

Proxy

/football/match/2008/04/all

/football/tournament/15/teams

/football/tournament/15/matches
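
The URLs on this slide suggest a small routing table that maps each address to a kind of resource. The patterns below are inferred from the example URLs only; the route names, the group names and the `route` function are assumptions.

```python
import re

# Patterns inferred from the example URLs on the slide.
ROUTES = [
    (re.compile(r"^/football/match/(?P<year>\d{4})/(?P<id>\d+)$"), "match"),
    (re.compile(r"^/football/match/(?P<year>\d{4})/(?P<month>\d{2})/all$"), "matches_in_month"),
    (re.compile(r"^/football/tournament/(?P<id>\d+)$"), "tournament"),
    (re.compile(r"^/football/tournament/(?P<id>\d+)/(?P<collection>teams|matches)$"),
     "tournament_collection"),
    (re.compile(r"^/football/team/(?P<id>\d+)$"), "team"),
]


def route(path):
    """Return (resource kind, parameters) for a path, or None if nothing matches."""
    for pattern, kind in ROUTES:
        match = pattern.match(path)
        if match:
            return kind, match.groupdict()
    return None
```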

Page 42: The New Guardian

Match

Home Team Away Team

Tournament

External information

[Diagram: Page (url, template), Content and Tag with their 1 to 1 and many-to-many associations, plus External Info (type, token) linked by many-to-many associations]

Persist the minimum information
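
Persisting the minimum could mean storing only a type and a token for each piece of external information and deriving everything else when the page is built. The sketch below assumes the token is the identifying part of the resource URL; the `ExternalInfo` class, `URL_TEMPLATES` and `external_url` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalInfo:
    type: str   # e.g. "match", "team", "tournament"
    token: str  # the external system's identifier; nothing else is stored


# Assumed mapping from type to resource URL, following the URL shapes above.
URL_TEMPLATES = {
    "match": "/football/match/{token}",
    "team": "/football/team/{token}",
    "tournament": "/football/tournament/{token}",
}


def external_url(info):
    """Build the address to request through the proxy at render time."""
    return URL_TEMPLATES[info.type].format(token=info.token)
```

The match itself (score, teams, venue) stays with the external system, so none of its business logic has to be modelled locally.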

Page 43: The New Guardian

Prepared to develop simple custom API

Reliable on the web.

Enter into SLA

Small supplier may proxy a larger supplier’s API

Shopping for suppliers

Page 44: The New Guardian

Thanks.

Join us... http://www.gnmcareers.co.uk

http://blogs.guardian.co.uk/inside/