George James :: Querying The Web

Querying the Web Out of the Slipstream :: September 27, 2007



Page 1: George James :: Querying The Web

Querying the Web

Out of the Slipstream :: September 27, 2007

Page 2: George James :: Querying The Web

Querying the Web

“Information wants to be free”
Stewart Brand, Whole Earth Catalogue, May 1985

“If the new computer set up allowed folks inside to be more creative and independent, why not open it up to outsiders, too?”
Jeff Bezos, Amazon, March 2002

“Data is the Next Intel Inside”
Tim O’Reilly, September 2005

Open Source has commoditized software
Creative Commons will commoditize information
Which leaves servers, services and service…

Page 3: George James :: Querying The Web

General Medical Council

Page 4: George James :: Querying The Web

General Medical Council

Page 5: George James :: Querying The Web

General Medical Council

Page 6: George James :: Querying The Web

Freebase

Page 7: George James :: Querying The Web

Freebase

Page 8: George James :: Querying The Web

Freebase

Page 9: George James :: Querying The Web

Freebase

Page 10: George James :: Querying The Web

Freebase

Metaweb Query Language (JSON)

Request:

{ "type": "/medicine/physician",
  "name": "Michael Maher" }

Response:

{ "code": "/api/status/ok",
  "result": { "type": "/medicine/physician",
              "name": "Michael Maher",
              "gender": "Male",
              "education": "Leeds University" } }
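The request and response on this slide are plain JSON, so any JSON library can build one and read the other. The following is a minimal Python sketch that uses only the query and response shown on the slide. It does not call any Freebase endpoint, and the helper names `physician_query` and `read_result` are illustrative assumptions, not part of a Metaweb API.

```python
import json


def physician_query(name):
    """Build the query-by-example object from the slide as a JSON string."""
    return json.dumps({"type": "/medicine/physician", "name": name})


# Response text copied from the slide (with plain ASCII quotes).
RESPONSE_TEXT = """
{ "code": "/api/status/ok",
  "result": { "type": "/medicine/physician",
              "name": "Michael Maher",
              "gender": "Male",
              "education": "Leeds University" } }
"""


def read_result(response_text):
    """Return the result object, or raise if the status code is not ok."""
    response = json.loads(response_text)
    if response.get("code") != "/api/status/ok":
        raise ValueError("query failed: %r" % response.get("code"))
    return response["result"]


query = physician_query("Michael Maher")
result = read_result(RESPONSE_TEXT)
print(query)
print(result["gender"], result["education"])
```

Checking the status code before touching `result` keeps a failed query from surfacing later as a confusing missing-key error.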

Page 11: George James :: Querying The Web

GMC vs Freebase

Freebase:
User-sourced content
API
Extensible, dynamic
Creative Commons / PD
Automatic right to use
Stepwise refinement

GMC:
Authoritative
Website-based search
Static
Restrictive license
Even if you pay for the data you still cannot use it, legally.
Periodic updates

Page 12: George James :: Querying The Web

REST

REpresentational State Transfer
Less rigorous equivalent of SOAP
Data are considered to be resources
Every resource has a unique address
Layered over HTTP:
Client/Server separation
Stateless
Cacheable

Request: GET http://rest.georgejames.com/product/Serenji/

Response:
Name=Serenji
Price=195.00
OrderCode=H1001
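A client for a resource like this needs little more than an HTTP GET and a parser for the body. The sketch below is written in Python and assumes the body is UTF-8 text with one `Key=Value` pair per line, as on the slide. The function names are illustrative assumptions, and the network call is defined but not executed here, since the slide's URL may no longer resolve.

```python
from urllib.request import urlopen


def parse_resource(body):
    """Parse 'Key=Value' lines, as in the slide's response, into a dict."""
    fields = {}
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value
    return fields


def get_resource(url):
    """GET a resource and parse its body (assumes a UTF-8 Key=Value body)."""
    with urlopen(url) as response:
        return parse_resource(response.read().decode("utf-8"))


# Body copied from the slide; no network access is needed for this part.
product = parse_resource("Name=Serenji\nPrice=195.00\nOrderCode=H1001\n")
print(product["Name"], float(product["Price"]))
```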

Page 13: George James :: Querying The Web

Amazon S3

S3 :: Simple Storage Service
Online storage space
$0.15 per Gbyte per month for storage
~ $0.20 per Gbyte for data transfer

Storage request: PUT http://s3.amazonaws.com/[bucket-name]/[key-name]

Retrieval request: GET http://s3.amazonaws.com/[bucket-name]/[key-name]

EC2 :: Elastic Compute Cloud
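The two S3 requests differ only in the HTTP method, which a short Python sketch can make concrete. The bucket and key names below are hypothetical, and the requests are built but never sent. A real S3 call also needs authentication headers, which are omitted here.

```python
from urllib.parse import quote
from urllib.request import Request

S3_ENDPOINT = "http://s3.amazonaws.com"


def object_url(bucket, key):
    """Return the path-style address of an object, as on the slide."""
    return "%s/%s/%s" % (S3_ENDPOINT, quote(bucket), quote(key))


def storage_request(bucket, key, data):
    """Build (but do not send) the PUT request that would store `data`."""
    return Request(object_url(bucket, key), data=data, method="PUT")


def retrieval_request(bucket, key):
    """Build (but do not send) the GET request that would fetch the object."""
    return Request(object_url(bucket, key), method="GET")


# Hypothetical bucket and key, for illustration only.
put = storage_request("example-bucket", "notes/slides.txt", b"hello")
get = retrieval_request("example-bucket", "notes/slides.txt")
print(put.get_method(), put.full_url)
print(get.get_method(), get.full_url)
```

The same address serves both operations, which is the REST idea from the previous slide: the resource has one URL, and the verb says what to do with it.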

Page 14: George James :: Querying The Web

Microformats

Page 15: George James :: Querying The Web

Microformats

Without Microformats:

<div class='opaque'>
Out of the Slipstream is a one-day conference on
Thursday 27 September 2007 at Brooklands Museum, Surrey, UK.
</div>

With Microformats:

<div class='opaque vevent'>
<span class='summary'>Out of the Slipstream</span>
is a one-day conference on
<abbr class="dtstart" title="20070927">Thursday 27 September 2007</abbr>
at Brooklands Museum, Surrey, UK.
</div>
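The point of the extra class names is that a program can now pick the event out of the page. As a rough illustration, the Python sketch below reads the `summary` and `dtstart` properties from the marked-up version using only the standard library's `html.parser`. The class `VEventParser` is an illustrative assumption and handles only the simple, non-nested markup shown on the slide. It is not a full hCalendar parser.

```python
from html.parser import HTMLParser

MARKUP = """
<div class='opaque vevent'>
<span class='summary'>Out of the Slipstream</span>
is a one-day conference on
<abbr class="dtstart" title="20070927">Thursday 27 September 2007</abbr>
at Brooklands Museum, Surrey, UK.
</div>
"""


class VEventParser(HTMLParser):
    """Collect the summary text and dtstart title from simple vevent markup."""

    def __init__(self):
        super().__init__()
        self.event = {}
        self._capture = None  # name of the property whose text is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "dtstart" in classes and "title" in attrs:
            # The machine-readable date lives in the title attribute.
            self.event["dtstart"] = attrs["title"]
        elif "summary" in classes:
            self._capture = "summary"
            self.event["summary"] = ""

    def handle_data(self, data):
        if self._capture:
            self.event[self._capture] += data

    def handle_endtag(self, tag):
        # Any closing tag ends capture; enough for this flat markup.
        self._capture = None


parser = VEventParser()
parser.feed(MARKUP)
parser.close()
print(parser.event)
```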

Page 16: George James :: Querying The Web

Microformats

Page 17: George James :: Querying The Web

Astoria

Page 18: George James :: Querying The Web

Astoria in action

Request: http://astoria.sandbox.live.com/northwind/northwind.rse/Categories

Response:

Page 19: George James :: Querying The Web

Astoria in action

Request: http://astoria.sandbox.live.com/northwind/northwind.rse/Customers

Response:

Page 20: George James :: Querying The Web

Astoria in action

Request: /Customers[FRANK]

Response:

Page 21: George James :: Querying The Web

Astoria in action

Request: /Customers[FRANK]/Orders

Response:

Page 22: George James :: Querying The Web

Astoria in action

A variety of response formats:
POX
Web3S (Web, Structured, Schema’d and Searchable)
ATOM
JSON

JSON request: /Customers[FRANK]?$format=json

Response:

Page 23: George James :: Querying The Web

Astoria is still evolving

Ongoing discussion about the format of requests:
/Customers!'FRANK'
/Customers!'FRANK'/Orders!10267
/Customers!CustomerID='FRANK'
/Customers('FRANK')
/Customers('FRANK')/Orders(10267)

Qualifiers control the response format:
/Customers('FRANK')/CustomerName
/Customers('FRANK')/CustomerName/$value
/Customers('FRANK')/$format=json
/Customers/$skip=30&$take=10

Currently being Microsoftened…
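To make the addressing idea concrete, here is a small Python sketch that composes paths in the parenthesised-key form, which is only one of the candidate syntaxes listed above. The helpers `key_literal` and `entity_path` are hypothetical and are not part of any Astoria client library. They assume string keys are single-quoted and integer keys are left bare, as in the slide's examples.

```python
def key_literal(key):
    """Quote string keys; leave integer keys bare, as in Orders(10267)."""
    if isinstance(key, int):
        return str(key)
    return "'%s'" % key


def entity_path(*steps):
    """Join (entity_set, key) steps into a path such as /Customers('FRANK')."""
    return "".join("/%s(%s)" % (name, key_literal(key)) for name, key in steps)


print(entity_path(("Customers", "FRANK")))
print(entity_path(("Customers", "FRANK"), ("Orders", 10267)))
```

Each step narrows the previous one, so a path reads as a navigation from a collection to one entity and on to its related entities.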

Page 24: George James :: Querying The Web

Where is all this information going to come from?

Page 25: George James :: Querying The Web

Crowdsourcing

Jeff Howe, Wired Magazine, June 2006
Delegating an activity to a large number of unidentified individuals
Small finite tasks
Quantity more important than quality
The sum is greater than the parts

Examples:
Wikipedia

Page 26: George James :: Querying The Web

Crowdsourcing

Page 27: George James :: Querying The Web

Crowdsourcing

Page 28: George James :: Querying The Web

Google Maps

Page 29: George James :: Querying The Web

Google Maps

Page 30: George James :: Querying The Web

Crowdsourcing

Jeff Howe, Wired Magazine, June 2006
Delegating an activity to a large number of unidentified individuals
Small finite tasks
Quantity more important than quality
The sum is greater than the parts

Examples:
Wikipedia
Galaxy Zoo
Amazon Mechanical Turk
Google route planner

Consequences:
Drives down the cost of data
Owners may not be the traditional incumbents
Client / user needs to discriminate

Page 31: George James :: Querying The Web

The Power of Information Review

Commissioned by the Cabinet Office, published in June 2007, to review and advise on the use of public sector information.

Recommendation 9: By Budget 2008, government should commission and publish an independent review of the costs and benefits of the current trading fund charging model for the re-use of public sector information, including the role of the five largest trading funds, the balance of direct versus downstream economic revenue, and the impact on the quality of public sector information.

US: Public Domain
UK: Crown Copyright

Page 32: George James :: Querying The Web

AND - Automotive Navigation Data

Press release: July 4, 2007, Rotterdam - AND Automotive Navigation Data has agreed ... to donate digital maps of the Netherlands, China and India to the community.

Page 33: George James :: Querying The Web

More ways of querying the web

Google Search
Google Events
Google Base
Yahoo! Pipes
RSS – Really Simple Syndication
KML
BBC Backstage

Page 34: George James :: Querying The Web

The Internet is the Database

Page 35: George James :: Querying The Web

Thank you

Questions?