Web Standards for the Clumps Projects

Brian Kelly
UK Web Focus
UKOLN
University of Bath

Email Address: [email protected]
URL: http://www.ukoln.ac.uk/

UKOLN is funded by the British Library Research and Innovation Centre, the Joint Information Systems Committee of the Higher Education Funding Councils, as well as by project funding from the JISC’s Electronic Libraries Programme and the European Union. UKOLN also receives support from the University of Bath where it is based.

Contents
• Introduction
• Web Standards Overview
• Web Standards:
  • Data Formats
  • Transport
  • Addressing
• Metadata
• Distributed Searching
• Collections
• Authentication
• Deployment Issues

Aims of Talk
• To give brief overview of web architecture
• To describe developments to web standards
• To briefly address implementation models

Standardisation

W3C
• Produces W3C Recommendations on Web protocols
• Managed approach to developments
• Protocols initially developed by W3C members
• Decisions made by W3C, influenced by member and public review

IETF
• Produces Internet Drafts on Internet protocols
• Bottom-up approach to developments
• Protocols developed by interested individuals
• "Rough consensus and working code"

ISO
• Produces ISO Standards
• Can be slow moving and bureaucratic
• Produces robust standards

Proprietary
• De facto standards
• Often initially appealing (cf PowerPoint, PDF)
• May emerge as standards

Examples (call-outs from the slide diagram):
• PNG, HTML, Z39.50, Java?
• PNG, HTML, HTTP
• HTTP, URN, whois++
• HTML extensions, PDF and Java?

The Web Vision
Tim Berners-Lee's (and W3C's) vision for the Web:
• Evolvability is critical
• Automation of information management: if a decision can be made by machine, it should
• All structured data formats should be based on XML
• Migrate HTML to XML
• All logical assertions to map onto RDF model
• All metadata to use RDF

See keynote talk at WWW 7 conference at <URL: http://www.w3.org/Talks/1998/0415-Evolvability/slide1-1.htm>

HTML 4.0, CSS 2.0 and DOM
HTML 4.0 used in conjunction with CSS 2.0 (Cascading Style Sheets) and the DOM provides an architecturally pure, yet functionally rich environment

HTML 4.0 - W3C-Rec
• Improved forms
• Hooks for stylesheets
• Hooks for scripting languages
• Table enhancements
• Better printing

CSS 2.0 - W3C-Rec
• Support for all HTML formatting
• Positioning of HTML elements
• Multiple media support

Problems
• Changes during CSS development
• Netscape & IE incompatibilities
• Continued use of browsers with known bugs

DOM - W3C-Rec
• Document Object Model
• Hooks for scripting languages
• Permits changes to HTML & CSS properties and content

HTML Limitations
HTML 4.0 / CSS 2.0 have limitations:
• Difficulties in introducing new elements
  – Time-consuming standardisation process (<ABBREV>)
  – Dictated by browser vendor (<BLINK>, <MARQUEE>)
• Area may be inappropriate for standardisation:
  – Covers specialist area (maths, music, ...)
  – Application-specific (<STUD-NUM>)
• HTML is a display (output) format
• HTML's lack of arbitrary structure limits functionality:
  – Find all memos copied to John Smith
  – How many unique tracks on Jackson Browne CDs

XML
XML:
• Extensible Markup Language
• A lightweight SGML designed for network use
• Addresses HTML's lack of evolvability
• Arbitrary elements can be defined (<STUDENT-NUMBER>, <PART-NO>, etc)
• Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998
• Support from industry (SGML vendors, Microsoft, etc.)
• Support in Netscape 5 and IE 5

XML Concepts
Well-formed XML resources:
• Make end-tags explicit: <LI>...</LI>
• Make empty elements explicit: <IMG .../>
• Quote attributes: <IMG SRC="logo" HEIGHT="20"/>
• Use consistent upper/lower case

Valid XML resources:
• Need DTD

XML Namespaces:
Mechanism for ensuring unique XML elements:
<?xml:namespace ns="http://foo.org/1998-001" prefix="i"?>
<P>Insert <i:PART>M-471</i:PART></P>
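The well-formedness rules above can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser; the sample fragments are made up for illustration and are not from the talk.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML-style markup: unclosed <LI>, unclosed empty element, unquoted attributes.
html_style = '<UL><LI>One<LI>Two<IMG SRC=logo HEIGHT=20></UL>'

# The same content with explicit end-tags, an explicit empty element
# and quoted attributes.
xml_style = '<UL><LI>One</LI><LI>Two<IMG SRC="logo" HEIGHT="20"/></LI></UL>'

# Inconsistent case: <li> and </LI> are different names in XML.
mixed_case = '<UL><li>One</LI></UL>'

if __name__ == "__main__":
    for label, text in [("HTML style", html_style),
                        ("XML style", xml_style),
                        ("mixed case", mixed_case)]:
        verdict = "well-formed" if is_well_formed(text) else "not well-formed"
        print(label, "->", verdict)
```

Note that this only tests well-formedness. Checking validity would also need the DTD, which ElementTree does not validate against.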

XML Deployment
Ariadne issue 15 has article on "What Is XML?"
Describes how XML support can be provided:
• Natively by new browsers
• Back end conversion of XML to HTML
• Client-side conversion of XML to HTML / CSS
• Java rendering of XML

Examples of intermediaries

See http://www.ariadne.ac.uk/issue15/what-is/
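As an illustration of the back end conversion option, here is a minimal sketch of a server-side routine that turns application-specific XML into HTML for browsers without XML support. The <PARTS> vocabulary and the function name are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical application-specific XML (element names are illustrative only).
SOURCE = """<PARTS>
  <PART><PART-NO>M-471</PART-NO><NAME>Bracket &amp; bolt</NAME></PART>
  <PART><PART-NO>M-472</PART-NO><NAME>Washer</NAME></PART>
</PARTS>"""


def parts_to_html(xml_text: str) -> str:
    """Convert the hypothetical <PARTS> document into an HTML list."""
    root = ET.fromstring(xml_text)
    items = []
    for part in root.findall("PART"):
        number = part.findtext("PART-NO", default="")
        name = part.findtext("NAME", default="")
        # Escape the text again, because the parser has already decoded entities.
        items.append(f"<LI>{escape(number)}: {escape(name)}</LI>")
    return "<UL>\n" + "\n".join(items) + "\n</UL>"


if __name__ == "__main__":
    print(parts_to_html(SOURCE))
```

The structure (part number, name) survives only in the XML source. Once converted, the HTML is purely a display format, which is the limitation the earlier slides describe.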

XLink, XPointer and XSL
XLink will provide sophisticated hyperlinking missing in HTML:
• Links that lead user to multiple destinations
• Bidirectional links
• Links with special behaviors:
  – Expand-in-place / Replace / Create new window
  – Link on load / Link on user action
• Link databases

XPointer will provide access to arbitrary portions of XML resource

XSL stylesheet language will provide extensibility and transformation facilities (e.g. create a table of contents)

[Diagram labels: England, France]

<commentary xml:link="extended" inline="false">
  <locator href="smith2.1" role="Essay"/>
  <locator href="jones1.4" role="Rebuttal"/>
  <locator href="robin3.2" role="Comparison"/>
</commentary>

XML Update

Data / Schemas
• XML-Data: Submitted to W3C Jan 98 (Obsolete?)
• Document Content Description: Submitted Aug 98
• XSchema: Independent effort

Programming Interface
• DOM level 1: W3C Recommendation, May 98

Style & Presentation
• CSS level 2: W3C Recommendation, May 98
• Extensible Style Language: Working Draft, Aug 98

Relationship to Other Resources
• XLink, XPointer: Working Drafts, Mar 98
• XML Namespaces: Working Draft, Aug 98

Query Languages
• XML Query Language: Submitted to W3C Aug 98
• XQL: Independent effort

Addressing
URLs (e.g. http://www.bristol-poly.ac.uk/depts/music/) have limitations:
• Lack of long-term persistency
  – Organisation changes name
  – Department shut down or merged
  – Directory structure reorganised
• Inability to support multiple versions of resources (mirroring)

URNs (Uniform Resource Names):
• Proposed as solution
• Difficult to implement (no W3C activity in this area)

Addressing - Solutions
DOIs (Digital Object Identifiers):
• Proposed by publishing industry as a solution
• Aimed at supporting rights ownership
• Business model needed

PURLs (Persistent URLs):
• Provide single level of redirection

Pragmatic Solution:
• URLs don't break - people break them
• Design URLs to have long life-span

Further information:
<URL: http://www.ukoln.ac.uk/metadata/resources/urn/>
<URL: http://hosted.ukoln.ac.uk/biblink/wp2/links.html>
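The "single level of redirection" idea behind PURLs can be sketched as a lookup table that maps a persistent name to the resource's current location. This is a minimal sketch, not the PURL service's actual implementation; the path /net/music-dept and the replacement URL are invented for illustration.

```python
from typing import Optional, Tuple

# Hypothetical resolver table: persistent path -> current location.
# The path is invented; the initial target is the example URL from the slides.
PURL_TABLE = {
    "/net/music-dept": "http://www.bristol-poly.ac.uk/depts/music/",
}


def resolve(purl_path: str) -> Tuple[int, Optional[str]]:
    """Return an HTTP-style (status, Location) pair for a persistent path."""
    target = PURL_TABLE.get(purl_path)
    if target is None:
        return 404, None
    return 302, target  # single level of redirection


def relocate(purl_path: str, new_url: str) -> None:
    """Record that the resource has moved; the persistent path is unchanged."""
    PURL_TABLE[purl_path] = new_url


if __name__ == "__main__":
    print(resolve("/net/music-dept"))
    # The organisation changes name: only the table entry is updated.
    relocate("/net/music-dept", "http://www.example.ac.uk/music/")
    print(resolve("/net/music-dept"))
```

Links that cite the persistent path keep working after the move, but somebody still has to maintain the table, so the scheme shifts the persistence problem rather than removing it.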

Transport
HTTP/0.9 and HTTP/1.0:
• Design flaws and implementation problems

HTTP/1.1:
• Addresses some of these problems
• 60% server support
• Performance benefits! (60% packet traffic reduction)
• Is acting as fire-fighter
• Not sufficiently flexible or extensible

HTTP/NG:
• Radical redesign using object-oriented technologies
• Undergoing trials
• Gradual transition (using proxies)
• Integration of application (distributed searching?)

Metadata
Metadata - the missing architectural component from the initial implementation of the web

[Diagram: architectural components of the web]
• Metadata - RDF (PICS, TCN, MCF, DSig, DC, ...)
• Addressing - URL
• Data format - HTML
• Transport - HTTP

Metadata Needs:
• Resource discovery
• Content filtering
• Authentication
• Improved navigation
• Multiple format support
• Rights management

Metadata Examples
DSig (Digital Signatures initiative):
• Key component for providing trust on the web
• DSig 2.0 will be based on RDF and will support signed assertions:
  – This page is from the University of Bath
  – This page is a legally-binding list of courses provided by the University

P3P (Platform for Privacy Preferences):
• Developing methods for exchanging Privacy Practices of Web sites and users

Note that discussions about additional rights management metadata are currently taking place

Sitemaps
Sitemaps provide navigational alternatives to browsing a site by following links.

Configurable site maps will enable end users to define hierarchies

http://www.elsop.com/linkscan/map.html

RDF
RDF (Resource Description Framework):
• Highlight of WWW 7 conference
• Provides a metadata framework ("machine understandable metadata for the web")
• Based on ideas from content rating (PICS), resource discovery (Dublin Core) and site mapping (MCF)
• Applications include:
  – cataloging resources
  – resource discovery
  – electronic commerce
  – intelligent agents
  – digital signatures
  – content rating
  – intellectual property rights
  – privacy
• See <URL: http://www.w3.org/Talks/1998/0417-WWW7-RDF>

RDF Model
RDF:
• Based on a formal data model (directed labelled graphs)
• Syntax for interchange of data
• Schema model

[Diagram: a Resource is linked to a Value by an arc labelled with a PropertyType; for example, page.html has Cost £0.05]

[Diagram, "RDF Data Model": a Property node (InstanceOf Property) with PropObj page.html, PropName Cost, Value £0.05 and ValidUntil 11-May-98]
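The data model above can be sketched as a list of (resource, property type, value) statements. This is one reading of the slide's diagram, not a definitive rendering of it: the identifier stmt1 is invented, and treating the second diagram as a statement that has been given its own node (so that ValidUntil can be attached to it) is an assumption.

```python
from typing import List, NamedTuple


class Statement(NamedTuple):
    resource: str       # the thing being described
    property_type: str  # the label on the arc
    value: str          # a literal or another resource


# "page.html has Cost £0.05", taken from the slide's example.
statements: List[Statement] = [
    Statement("page.html", "Cost", "£0.05"),
]

# To say something about the statement itself (here, how long it is valid),
# the statement is given its own node. "stmt1" is a made-up identifier.
statements += [
    Statement("stmt1", "InstanceOf", "Property"),
    Statement("stmt1", "PropObj", "page.html"),
    Statement("stmt1", "PropName", "Cost"),
    Statement("stmt1", "Value", "£0.05"),
    Statement("stmt1", "ValidUntil", "11-May-98"),
]


def values_of(resource: str, property_type: str) -> List[str]:
    """Follow the arcs labelled property_type that leave the given resource."""
    return [s.value for s in statements
            if s.resource == resource and s.property_type == property_type]


if __name__ == "__main__":
    print(values_of("page.html", "Cost"))
    print(values_of("stmt1", "ValidUntil"))
```

Because every fact has the same three-part shape, a program can traverse the graph without knowing in advance which property types a particular vocabulary defines.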

RDF Example
Example of Dublin Core metadata in RDF

<?xml:namespace ns="http://www.w3.org/TR/WD-rdf/" prefix="rdf"?>
<?xml:namespace ns="http://purl.org/dublin_core/schema/" prefix="dc"?>
<rdf:RDF>
  <rdf:Description rdf:HREF="page.html">
    <dc:Creator>John Smith</dc:Creator>
    <dc:Title>John’s Home Page</dc:Title>
  </rdf:Description>
</rdf:RDF>
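The example above uses the draft syntax current at the time of the talk. As a hedged illustration of how such a description might be read by software, the sketch below swaps in the xmlns attribute syntax and the RDF and Dublin Core namespace URIs that were standardised later, so it is not a parser for the slide's markup. The function name describe is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: these are the namespace URIs that were standardised later;
# the slide uses draft-era URIs and processing-instruction syntax instead.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

DOCUMENT = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="page.html">
    <dc:creator>John Smith</dc:creator>
    <dc:title>John's Home Page</dc:title>
  </rdf:Description>
</rdf:RDF>"""


def describe(rdf_xml: str):
    """Yield (resource, property, value) for each simple literal property."""
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        resource = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # prop.tag has the form "{namespace}localname".
            local_name = prop.tag.split("}", 1)[1]
            yield resource, local_name, (prop.text or "").strip()


if __name__ == "__main__":
    for statement in describe(DOCUMENT):
        print(statement)
```

Each child of rdf:Description becomes one statement about page.html, which is the same resource, property, value shape as the RDF data model.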

Browser Support for RDF
Mozilla (Netscape's source code release) provides support for RDF.
Mozilla supports site maps in RDF, as well as bookmarks and history lists.
See Netscape's or HotWired home page for a link to the RDF file.

[Screenshot labels: Trusted 3rd Party Metadata; Embedded Metadata, e.g. sitemaps]

Image from http://purl.oclc.org/net/eric/talks/www7/devday/

RDF Conclusion
• RDF is a general-purpose framework
• RDF provides structured, machine-understandable metadata for the Web
• Metadata vocabularies can be developed without central coordination
• Role for eLib projects in defining schemas?
• RDF Schemas describe the meaning of each property name
• Signed RDF is the basis for trust

Distributed Searching
Distributed searching important for the DNER (Distributed National Electronic Resource)

ROADS prototype provides cross-searching using whois++

AHDS prototype provides cross-searching using Z39.50
http://prospero.ahds.ac.uk:8080/ahds_live/
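Whatever the protocol, cross-searching follows the same pattern: send one query to several targets, then merge what comes back. The sketch below shows only that pattern. The targets are in-memory stand-ins with invented names and records; a real gateway would speak whois++ or Z39.50 to each target instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Each target is a function taking a query and returning matching titles.
# These stand-ins search in-memory lists so that the sketch is runnable.
# All names and records are hypothetical.
Target = Callable[[str], List[str]]


def make_target(records: List[str]) -> Target:
    def search(query: str) -> List[str]:
        return [r for r in records if query.lower() in r.lower()]
    return search


TARGETS: Dict[str, Target] = {
    "gateway-a": make_target(["Music of the Baroque", "History of Art"]),
    "gateway-b": make_target(["Folk Music Archive", "Economic Statistics"]),
}


def cross_search(query: str) -> List[Tuple[str, str]]:
    """Send the query to every target in parallel and merge the results."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(search, query)
                   for name, search in TARGETS.items()}
        merged = [(name, hit)
                  for name, future in futures.items()
                  for hit in future.result()]
    # Sort by title so hits from different targets are interleaved.
    return sorted(merged, key=lambda pair: pair[1])


if __name__ == "__main__":
    for source, title in cross_search("music"):
        print(f"{title} (from {source})")
```

Keeping the source name with each hit matters for the policy questions raised later in the talk, such as a service requiring its logo to be shown with its results.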

How Metadata Could Be Used

Database Description
• Music resources, including ...

Policy (Terms & Conditions / Resource and Service)
• For licensing reasons, access is restricted to authorised HEIs
• For performance reasons, access restricted between 9.00 and 17.00
• The service logo must be included in results set, unless results only come from service
• Permission for cross-searching restricted to other eLib projects
• You're only allowed to link to the main entry point

Individual
• Give me HTML or PDF resources, not Word, …
• I'm blind. Include ACSS in results and deliver a sitemap

Client Software
• My browser doesn't support XML, so send me HTML

Issues:
• Loss of visibility
• Performance, ..
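The "Individual" and "Client Software" examples amount to matching three pieces of metadata: what the service can deliver, what the user wants, and what the client can display. A minimal sketch follows, assuming hypothetical field names (accept, unsupported) that are not part of any standard.

```python
from typing import List, Optional

# Hypothetical metadata records. The field names are illustrative only.
resource_formats = ["XML", "Word", "PDF", "HTML"]   # what the service can deliver
user_profile = {"accept": ["HTML", "PDF"]}          # "Give me HTML or PDF, not Word"
client_profile = {"unsupported": ["XML"]}           # "My browser doesn't support XML"


def choose_format(available: List[str], user: dict, client: dict) -> Optional[str]:
    """Pick the first format the user accepts that the client can display."""
    for fmt in user["accept"]:  # the user's order of preference
        if fmt in available and fmt not in client["unsupported"]:
            return fmt
    return None  # nothing suitable: the service must convert or refuse


if __name__ == "__main__":
    print(choose_format(resource_formats, user_profile, client_profile))
```

If the three descriptions were expressed in a shared framework such as RDF, an intermediary could make this decision without the user or the service hard-coding it.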

Collection Description Work
Collection Description Group:
• UKOLN involvement in producing list of attributes for collection level description (in the library, museum, archival sense), which includes databases of Internet resource descriptions such as SOSIG.
• Work of interest to clumps and hybrid libraries.
• WG membership: Dan Brickley (ROADS, ILRT), Andy Powell (ROADS), Verity Brack (RIDING), Matthew Dovey (Music Online, Malibu), Dennis Nicholson (BUBL/CAIRNS) and David Kay (FD).
• See <URL: http://www.ukoln.ac.uk/metadata/cld/>
• Collection Description eLib supporting study due out in Oct. Will define core attributes (cf Dublin Core).

Technologies
A number of formats and protocols could be used to implement distributed searching. XML and RDF plus:
• Z39.50
  ISO standard. Well-known in library world, but heavy-weight
• whois++
  Lightweight IETF standard. Used in ANR gateways, but not widely deployed
• LDAP
  Lightweight version of X.500 directory service.
• HTTP/NG?
  Opportunity to develop new solution using OO technologies

IETF WebDAV:
• Requirements for distributed authoring include author metadata and collection definitions. See <URL: http://www.ietf.org/html.charters/webdav-charter.html> and <URL: http://www.ietf.org/ids.by.wg/webdav.html>

Authentication
Deployment of an open, scalable, flexible authentication system is difficult & expensive

Current solutions include:
• Server-based username and password schemes
• IP-based schemes
• Athens - Based on replicated Sybase application. See <URL: http://www.athens.ac.uk/>
• W3C DSig work - Digital Signatures Initiative. See <URL: http://www.w3.org/DSig/>
• Other Public Key developments - e.g. reports of Post Office involvement, statements from Tony Blair, EU, ..

"In May 1998 the Commission published its proposal for a "European Parliament and Council Directive on a Common Framework for Electronic Signatures" (COM(1998)297)."
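Of the current solutions listed, the IP-based scheme is the simplest to illustrate: the server compares the client's address with a list of licensed networks. The sketch below assumes Python's standard ipaddress module; the network ranges are documentation-only addresses, not real institutions.

```python
import ipaddress

# Hypothetical list of licensed networks. These are documentation-only
# address ranges (RFC 5737), not real institutions.
AUTHORISED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_authorised(client_ip: str) -> bool:
    """Return True if the client address falls inside a licensed network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in AUTHORISED_NETWORKS)


if __name__ == "__main__":
    for ip in ("192.0.2.17", "203.0.113.5"):
        print(ip, "->", "allowed" if is_authorised(ip) else "refused")
```

The check identifies a network rather than a person, so it cannot distinguish individual users and it fails for authorised users who connect from elsewhere. That is part of why an open, flexible scheme is harder to deploy.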

Certificates
Should we be looking into using commercially-supported digital IDs, such as Verisign's?
• Can purchase server ID for $349
• End user certificates available

Use certificates to positively identify yourself, certificate authorities and publishers

Browser Support

Need for a certification infrastructure

Deployment Issues
More sophisticated deployment techniques can be adopted to overcome deficiencies in simple model

Original Model
[Diagram: HTML resource, Web server, browser]
• Web server simply sends file to client
• File contains redundant information (for old browsers) plus client interrogation support

Sophisticated Model
[Diagram: HTML / XML / database resource, Intelligent Web server, Server proxy, Client proxy, browser]

Intermediaries can provide functionality not available at client:
• DOI support
• XML support / format conversion
• Authentication

Example of an intermediary

Conclusions
To conclude:
• Standards are important, especially for national initiatives, such as eLib
• Proprietary solutions are often tempting because:
  – They are available
  – They are often well-marketed and well-supported
  – They may become standardised
  – Solutions based on standards may not be properly supported by applications
• Metadata is big growth area
• Opportunity for eLib projects to shape developments
• Intermediaries may have a role to play in deploying standards-based solutions
• Intelligent servers likely to be important