The New Guardian

Post on 17-Nov-2014

Stephen Dunn, Mat Wall and Phil Wills talk about The Guardian platform at OpenTech 2008: how we are using open source technologies; the open source tools that have been released by developers who worked on the recent rebuild; and how our use of REST is creating hackable feeds and allowing services and data from the web to be built into our platform.

Transcript of The New Guardian

Stephen Dunn, head of technology strategy

Mat Wall & Phil Wills, technical architects

Rebuilding guardian.co.uk

Online since 1996

1.5M pages and counting

181M pages/month (Jan 08)

19.7M visitors (Jan 08)

3x Webby award winner

guardian.co.uk 1999 - 2007

Relaunch, 2007-2008

Could not redesign

Thousands of templates with content and presentation mixed

Editors manually created most navigation

Platform unable to:

create open services around content

connect to other services on the web

No strong information model beyond “stuff on a page”

What was wrong

Resources should be:

permanent

addressable

discoverable

New web, new rules

“A cool URI is one that does not change” Tim Berners-Lee 1998

1.5 million resources redirected to new URL scheme

Permanent
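
Redirecting old addresses to a new URL scheme can be sketched as a lookup that answers with a permanent (301) redirect. This is a minimal illustration, not the mechanism used on guardian.co.uk: the table `LEGACY_TO_NEW`, the function `resolve_legacy` and both example paths are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping from a legacy path to its new permanent path.
# Both paths are made up for illustration.
LEGACY_TO_NEW: Dict[str, str] = {
    "/legacy/story-12345.html": "/world/2008/jan/01/example-story",
}


def resolve_legacy(path: str) -> Optional[Tuple[int, str]]:
    """Return (301, new_path) for a known legacy path, otherwise None."""
    new_path = LEGACY_TO_NEW.get(path)
    if new_path is None:
        return None  # not a legacy address; let normal routing handle it
    return 301, new_path  # 301 = Moved Permanently
```

A 301 tells browsers, feed readers and search engines that the new address replaces the old one, so existing links keep working.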

http://www.flickr.com/photos/fstorr/

Addressable

Resources are “about” something, and there is a resource for everything meaningful. Ready for the social web: e.g. “share this” shares something meaningful

The “age of point-at-things” (Coates 2005)

Usability testing showed

When browsing the web, most people don’t really know which site they are on, let alone which section of the site.

Discoverable

Multiple routes to content

Most content is multifaceted

Wherever there is more than one piece of content gathered - offer it as a feed
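
As a rough sketch of "offer it as a feed", any gathered list of content can be serialised as a minimal RSS 2.0 document. The function name `build_feed` and the (title, link) item shape are assumptions made for this example; the slides do not say which feed format or library was used.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def build_feed(title: str, link: str, items: Iterable[Tuple[str, str]]) -> str:
    """Serialise a list of (item_title, item_link) pairs as RSS 2.0."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    # RSS 2.0 requires a channel description; reuse the title here.
    ET.SubElement(channel, "description").text = title
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    return ET.tostring(rss, encoding="unicode")
```

The same function can back every aggregation on the site, which is what makes "a feed wherever content is gathered" cheap to offer.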

[Diagram: section tree with “iraq” highlighted]

guardian

world news: usa, iraq

sport: tennis, cricket, rugby

media: tv, radio, press

Automatically generate pages and navigation

Still under editorial control

Routes to content

Pages are responsible for aggregating the core resource and other data for our readers

Pages are associated with templates which provide layout for the data

Pages can be rendered in more than one way

Our architecture: pages

The addressable resource is the content

We only allow one core resource per URL

We can have multiple types of content

We also need to create pages that aggregate many resources

[Diagram: a Page has a url and a template and holds one core Content resource (1 to 1); Article and Video are types of Content]

Our architecture: pages
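
The page model described above can be sketched with a few small classes. This is an illustrative reading of the slides, not the actual Guardian code: the class names follow the diagram labels, while `Site`, `publish`, `render` and the two renderers are hypothetical.

```python
from dataclasses import dataclass
from html import escape
from typing import Callable, Dict


@dataclass(frozen=True)
class Content:
    headline: str


@dataclass(frozen=True)
class Article(Content):
    body: str = ""


@dataclass(frozen=True)
class Video(Content):
    video_url: str = ""


@dataclass(frozen=True)
class Page:
    url: str
    content: Content  # exactly one core resource per page
    template: str     # name of the layout template


class Site:
    """Registry that enforces one core resource per URL."""

    def __init__(self) -> None:
        self._pages: Dict[str, Page] = {}

    def publish(self, page: Page) -> Page:
        if page.url in self._pages:
            raise ValueError(f"URL already has a core resource: {page.url}")
        self._pages[page.url] = page
        return page


# The same page can be rendered in more than one way.
RENDERERS: Dict[str, Callable[[Page], str]] = {
    "html": lambda page: f"<h1>{escape(page.content.headline)}</h1>",
    "text": lambda page: page.content.headline,
}


def render(page: Page, fmt: str = "html") -> str:
    return RENDERERS[fmt](page)
```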

The world is not a hierarchy

Tags categorise content

Tags provide many routes to content

Tags drive syndication

Our architecture: tags

We can have many types of tag

Tag pages can automatically aggregate related content

Folders provide alternate taxonomy for groups of tags

Associations freely managed by editors

[Diagram: url, Page, Content, Tag (types: Keyword, Author, etc.) and Folder, linked by 1 to 1 and many to many associations]
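
A minimal sketch of how typed tags and folders might drive automatic aggregation, assuming a simple in-memory index. `Tag`, `TagIndex` and all method names are hypothetical and only illustrate the relationships named on the slide.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Tag(NamedTuple):
    type: str  # e.g. "keyword" or "author"
    name: str


class TagIndex:
    def __init__(self) -> None:
        self._content_by_tag: Dict[Tag, List[str]] = defaultdict(list)
        self._tags_by_folder: Dict[str, Set[Tag]] = defaultdict(set)

    def tag_content(self, content_url: str, *tags: Tag) -> None:
        """Associate one piece of content with any number of tags."""
        for tag in tags:
            if content_url not in self._content_by_tag[tag]:
                self._content_by_tag[tag].append(content_url)

    def add_to_folder(self, folder: str, *tags: Tag) -> None:
        """Group tags under a folder, an alternate taxonomy."""
        self._tags_by_folder[folder].update(tags)

    def tag_page(self, tag: Tag) -> List[str]:
        """Content aggregated automatically for a tag page."""
        return list(self._content_by_tag.get(tag, []))

    def folder_page(self, folder: str) -> List[str]:
        """Content reachable through any tag in the folder, deduplicated."""
        urls: List[str] = []
        for tag in sorted(self._tags_by_folder.get(folder, set())):
            for url in self._content_by_tag.get(tag, []):
                if url not in urls:
                    urls.append(url)
        return urls
```

Because one article can carry several tags, it appears on several tag pages at once, which is the "many routes to content" idea without a fixed hierarchy.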

Tags drive advertising and 3rd party integration

Our architecture: tags

External information

Traditional feed: use database as integration point

Model and store external entities in internal database

External information

Requires knowledge of external business logic

Complex and expensive to build

Impedance mismatch: does not make use of the resource oriented nature of the web

[Diagram: an external system delivers a feed over the net into the database, which sits behind the app server and web server]

Delegate as much understanding & management of data as possible

Model should be resource oriented. External information should be “attached” rather than persisted.

Use simplest possible caching for external information

Business relationship should be partnership, not supplier / consumer

Using the web

[Diagram: database, app server and web server as before; the app server now reaches the external system over the net through a proxy]

Using the web

The simple change of not using the database as the integration point reduces the cost of integration

No longer need to fully understand external business logic

No longer need to clear cache

Information more up to date
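
One way to picture "simplest possible caching" is a time-based cache in front of the external system: entries simply expire, so nothing ever has to be cleared by hand. This is a sketch under that assumption. `ExpiringCache`, its parameters and the injected `fetch` function are hypothetical; the slides do not describe the real proxy.

```python
import time
from typing import Callable, Dict, Tuple


class ExpiringCache:
    """Fetch-through cache whose entries expire after ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # e.g. an HTTP GET against the external system
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: serve the cached body
        body = self._fetch(url)  # missing or expired: ask the external system
        self._entries[url] = (now, body)
        return body
```

The external data is attached at request time and never written to the internal database, so it is at most one expiry period out of date.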

How we see football

[Diagram: Match, Home Team, Away Team and Tournament, shown alongside Tag and Content]

Resource oriented model

Match: /football/match/2008/12345

Home Team, Away Team: /football/team/500, /football/team/31

Tournament: /football/tournament/15

/football/match/2008/04/all

/football/tournament/15/teams

/football/tournament/15/matches

[Diagram: the app server reaches the external system over the net through a proxy]
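
The URL shapes on this slide can be sketched as a small routing table. The patterns below are inferred from the example URLs only. Reading `2008/04` as year and month is an assumption, and the route names and the `route` function are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# (route name, pattern) pairs inferred from the example URLs on the slide.
ROUTES = [
    ("match", re.compile(r"/football/match/(?P<year>\d{4})/(?P<id>\d+)")),
    ("matches_in_month",
     re.compile(r"/football/match/(?P<year>\d{4})/(?P<month>\d{2})/all")),
    ("tournament", re.compile(r"/football/tournament/(?P<id>\d+)")),
    ("tournament_collection",
     re.compile(r"/football/tournament/(?P<id>\d+)/(?P<collection>teams|matches)")),
    ("team", re.compile(r"/football/team/(?P<id>\d+)")),
]


def route(path: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (route_name, parameters) for a football URL, or None."""
    for name, pattern in ROUTES:
        match = pattern.fullmatch(path)
        if match:
            return name, match.groupdict()
    return None
```

Each football entity gets its own address, and collections such as a tournament's teams are resources too.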

External information

[Diagram: the url / template / Page / Content / Tag model extended with External Info, which holds only a type and a token]

Persist the minimum information
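
Persisting only a type and a token might look like the sketch below: the stored record is just enough to rebuild the address of the external resource, and everything else is fetched when needed. The type names and the `PATH_TEMPLATES` table are hypothetical; only the `type` and `token` fields come from the diagram.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical mapping from an external-info type to a resource path.
PATH_TEMPLATES: Dict[str, str] = {
    "football-team": "/football/team/{token}",
    "football-tournament": "/football/tournament/{token}",
}


@dataclass(frozen=True)
class ExternalInfo:
    type: str   # which kind of external resource this is
    token: str  # the external system's own identifier

    def resource_path(self) -> str:
        """Rebuild the resource path from the two stored fields."""
        return PATH_TEMPLATES[self.type].format(token=self.token)
```

Nothing about a team's name, fixtures or results is stored internally, so the external supplier stays the single owner of that data.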

Prepared to develop a simple custom API

Reliable on the web

Will enter into an SLA

A small supplier may proxy a larger supplier’s API

Shopping for suppliers

Thanks.

Join us... http://www.gnmcareers.co.uk

http://blogs.guardian.co.uk/inside/