Towards a Semantic Web for Heritage Resources Thematic Issue 3 May 2003


Towards a Semantic Web for

Heritage Resources

Thematic Issue 3 May 2003

DigiCULT Consortium

www.digicult.info ISBN 3-902448-00-8


CONTENT

Guntram Geser
Introduction and Overview

Seamus Ross
Position Paper: Towards a Semantic Web for Heritage Resources

Interview with Janneke van Kersen
Development of the Semantic Web Must Begin at the Grass Roots Level

Michael Steemson
DigiCULT’s Expert 13 Tangle with the Semantic Web

Semantic Web Terms and Reading List, A-X

Interview with Nicola Guarino
Semantic Web should be based on Well-founded Ontologies

Guntram Geser
A Cultural Heritage Semantic Web Example & Primer

The Darmstadt Forum Participants

DigiCULT Project Information

Imprint

INTRODUCTION AND OVERVIEW

By Guntram Geser

FUNCTION AND FOCUS

DigiCULT, as a support measure within the Information Society Technologies Programme (IST), will for a period of 30 months (beginning March 2002) provide a technology watch mechanism for the cultural and scientific heritage sector. Backed by a network of peer experts, the project monitors, discusses and analyses existing and emerging technologies likely to bring benefits to the sector.

To promote the results and encourage early take-up of relevant technologies, DigiCULT has put in place a rigorous publication agenda of seven Thematic Issues, three in-depth Technology Watch Reports, as well as the DigiCULT.Info e-journal, pushed to a growing database of interested persons and organisations on a regular basis. All DigiCULT products can be downloaded from the project Web site http://www.digicult.info as they become available. The opportunity to subscribe to DigiCULT.Info is also found here.

March 2003 saw the release of the first DigiCULT Technology Watch Report. This report covers the topics Customer Relationship Management, Digital Asset Management Systems, Smart Labels and Smart Tags, Virtual Reality and Display Technologies, Human Interfaces, and Games Technologies. Addressing primarily technological issues, it serves as a guide to what a heritage institution needs to consider when buying into one of these technologies.

In comparison with the Technology Watch Reports, the Thematic Issues focus more on the organisational, policy and economic aspects of the technologies under consideration. They are based on the expert round tables organised by the DigiCULT Forum secretariat. In addition to the Forum discussion, they provide opinions of other experts in the form of articles and interviews, case studies, short descriptions of related projects, together with a selection of relevant literature.

TOPIC AND CHALLENGE

This third Thematic Issue addresses the questions: What is the Semantic Web? What will it do for heritage institutions? And what is the role of certain languages, in particular XML and RDF?

In short, the Semantic Web vision proclaims a Web of machine-readable data which allows software agents to automatically carry out rather complex tasks for humans. Key to realising this vision is semantic interoperability of Web resources. Yet such interoperability is not the primary goal of heritage institutions (and intelligent software agents are not readily at hand).

What the institutions are looking for are new ways of providing scholarly and non-expert users (e.g. school classes, lifelong learners) with access to their collections and related knowledge. This goal can be accomplished, for example, through online collections and exhibitions that not only display objects and simple descriptions (drawn from metadata), but also allow for understanding relationships between objects (created by semantically interrelated metadata). The Semantic Web community promises to assist in achieving this goal, but the challenge for the heritage institutions would be to first implement the necessary data infrastructure.

Philosophy in Discussion With a Philosopher

The challenge for the Semantic Web expert round table was, or at least the DigiCULT Secretariat thought it was, not to run into a debate between ‘theory’ and ‘practice’; in other words, between what academic Semantic Web scholars and what practitioners from heritage institutions think needs to be accomplished, what is feasible and affordable, and where to concentrate efforts. For the discussion, XML seemed to provide a good starting point: XML, on the one hand, is increasingly considered by heritage institutions as a key standard for publishing metadata on the Web; on the other hand, it is a major building block for the Semantic Web. It proved different, in a positive sense. In the discussion, wide use of XML was taken for granted, while the key area of interest that surfaced, and was seen to be most fruitful to explore, was ontologies.

OVERVIEW

Setting the context for this Issue, the position paper looks into the requirements for achieving the goals of the Semantic Web and assesses whether the available technologies will be able to deliver on what the advocates of the Semantic Web envisage, as well as whether the cultural heritage sector is in a position to take substantial steps towards semantic interoperability. It concludes with the argument that the sector is more likely to be left behind, due in particular to the fact that, for the institutions, the rewards for the necessary investments are still too nebulous.

Janneke van Kersen from the Dutch Digital Heritage Association, in her interview with the DigiCULT Journalist, suggests that despite the cloudy Semantic Web horizon there are medium-term benefits to be gained for heritage institutions in taking steps towards the vision. And she states that it is up to associations like hers, together with larger institutions, to take the lead in this, prove that proposed solutions work, and support smaller institutions in taking advantage of them. On the other hand, Nicola Guarino, in his interview, believes that reaching the ‘real’ Semantic Web lies in taking ‘the fundamental route’ of implementing generic ontologies based on linguistics and logics within the Semantic Web fabric. He also claims that even incremental progress along this path can have remarkable pay-offs.

Michael Steemson’s summary of the Darmstadt Forum illustrates that the Semantic Web topic resembles a labyrinth, with currently no definite map or Ariadne’s Thread at hand. Building on the many technologies the Forum participants mentioned as some of the labyrinth’s angles, we have added to the summary a list of resources related to these technologies.

In an effort to raise the veil of mystery surrounding the Semantic Web, this issue includes an example from the sector on the implementation of semantic interoperability of metadata, combined with a primer that explains core building blocks such as XML, RDF and ontologies. While a detailed primer of, for example, RDF would alone exhaust the limits of this issue (note 1), the goal here is to deliver an ‘all-inclusive’ primer within the space permitted, with all the inevitable limitations this entails. The primer attempts to provide a general understanding of the Semantic Web architecture without obliging the reader to wander through the long and perplexing corridors of language specifications.

Finally, we want to thank the Koninklijke Bibliotheek, National Library of the Netherlands, for their kind permission to use selected images from their collection of illuminated medieval manuscripts (note 2). We hope you will appreciate the little narratives they represent within the overall fabric of this DigiCULT Thematic Issue.

1 Cf. F. Manola, E. Miller: RDF Primer (W3C Working Draft, 23 January 2003), http://www.w3.org/TR/rdf-primer/

2 See their online collection of such images at http://www.kb.nl/kb/manuscripts, which offers advanced search and presentation features.

POSITION PAPER

By Seamus Ross

Genesis – The Creation: Division of Light and Darkness

Tim Berners-Lee and his colleagues at W3C have recognised that the real benefits of the web-based information revolution will come from enabling the interoperability of content. The current generation of web delivery is, they have argued, designed for human users, who struggle to make effective use of the billions of pages of information currently accessible. When we search for something at the moment, we sometimes discover suitable candidate information, but more often than not this is far from being the case. More than this, the entire process of searching, discovery and use is designed to be driven by humans. When we discover one piece of the puzzle, we need manually to position that information so that it can help us to search out the next piece of the puzzle. We find that Darmstadt is near Frankfurt. Then we find that there are flights from Glasgow to Frankfurt and there is a bus from Frankfurt Airport to Darmstadt. Then I search for timetables, make manual comparisons and decide which times best meet my requirements. In the Shangri-La that is the Semantic Web, my ‘agent’ would recognise from its regular review of my diary that I needed to be at a meeting in Darmstadt on the 21st of January 2003, and it would search out the options, analyse the timetables, identify the optimum travel arrangements, book my non-smoking hotel accommodation, and order the taxi to take me to the airport. (It might even check the weather forecasts and warn me to bring particular types of clothing.) Certainly, to make this happen there has to be a fundamental shift in the way data, information and knowledge are represented on the web.

The proliferation of web-based resources makes finding what you are looking for increasingly difficult. According to Internet user studies, in 1996 50% of Internet users reported spending time looking for information without finding it, but by 2002 only about 40% of users ended their ‘searching sessions’ unsuccessfully. At first glance we might conclude that web discovery tools have improved and/or the information searching skills of users have improved. Over the past seven years the quantity of content has mushroomed, the search tools have become more efficient, developers approach the use of meta-tags more effectively, and anecdotal evidence suggests that the searching techniques of users have become more sophisticated. We should continue to be surprised by the high failure rate and wonder why it remains proportionally so high as the numbers of users have grown to nearly 600 million. In reality there is just too much content available. It is poorly described. It is not interconnected. Search engines themselves are blunt instruments. Most users of the web do not have very mature searching strategies and rarely use even the blunt instruments as effectively as they might. A solution is to make more of the information capable of discovery, interpretation and reuse by automated information processing tools themselves. However, the current ways content is represented on the web make it nearly impossible for machines to search the web meaningfully and effectively – even with the limitations of their skills and tools, humans are better at searching the web than the most powerful of the current generation of agents. The emergence of the Semantic Web would solve this problem.

The web has made us realise the tremendous potential of digital resources and made them widely available. Content as presented on the web currently is mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented, so that it is visible and discoverable within the context of the Semantic Web.

Genesis – The Creation: Division of the Waters Above and Below the Firmament

The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules, or to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.

TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES

Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001: 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL (note 1), do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S) because it provides a flexible and extensible mechanism to represent metadata. A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML-based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response, other languages such as DAML+OIL have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web. If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a: 147).

The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains; it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?
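The point that XML fixes element names but not meaning can be made concrete with a small sketch. The two records below are invented for illustration (the element and attribute names are assumptions, not taken from any heritage schema): both carry the same three values, yet nothing in the markup itself tells a program that `creator` and `maker` play the same role.

```python
import xml.etree.ElementTree as ET

# Two hypothetical records describing the same object with different markup.
RECORD_A = "<object><title>Salome</title><creator>Unknown</creator><date>1480</date></object>"
RECORD_B = "<item name='Salome'><maker>Unknown</maker><made>1480</made></item>"


def leaf_values(xml_text):
    """Collect every attribute value and text value, ignoring the markup names."""
    root = ET.fromstring(xml_text)
    values = set()
    for element in root.iter():
        values.update(element.attrib.values())
        if element.text and element.text.strip():
            values.add(element.text.strip())
    return values


def element_names(xml_text):
    """Collect the element names used by a record."""
    root = ET.fromstring(xml_text)
    return {element.tag for element in root.iter()}


# The data values coincide, but the two vocabularies have no name in common,
# so a program cannot align the records without an external mapping.
same_data = leaf_values(RECORD_A) == leaf_values(RECORD_B)
shared_names = element_names(RECORD_A) & element_names(RECORD_B)
```

A schema or DTD for either record would validate its structure, but it would still not say that the two vocabularies describe the same thing; that mapping has to come from somewhere else, which is the gap that RDF and ontologies are meant to fill.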

1 See the ‘Semantic Web Terms and Reading List’ in this Issue.

ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB

For the Semantic Web to succeed it will require not only modelling languages such as XML, RDF and OWL, but it will also require methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.

The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions, we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections, and some simple rules of inference and logic for some particular topic’ (Hendler 2001: 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but it does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:

- Can we cost the creation of appropriate ontologies for the heritage sector?
- How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop and which ones will we be able to borrow from other sectors)?
- What heritage-based organisations should focus on ontology creation?
- Ontologies often fail to be interoperable. What solutions are there to this problem and how can they be made to work effectively?
- Does OWL (W3C’s Web Ontology Semantic Markup Language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
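The watercolour and aquarelle example can be sketched in a few lines. This is only an illustration under stated assumptions: the dictionaries stand in for an ontology’s class taxonomy and for a separate table of language labels, and all names (`SUBCLASS_OF`, `LABELS`, `is_a`) are hypothetical rather than part of OWL or any real toolkit.

```python
# Hypothetical class taxonomy: each concept points to its parent concept.
SUBCLASS_OF = {
    "watercolour": "painting",
    "necklace": "jewellery",
}

# Hypothetical multilingual labels attached to concepts.
LABELS = {
    "watercolour": {"en": "watercolour", "fr": "aquarelle"},
    "painting": {"en": "painting", "fr": "peinture"},
}


def concept_for(term, labels):
    """Return the concept that carries this label in any language, if one exists."""
    for concept, by_language in labels.items():
        if term in by_language.values():
            return concept
    return None


def ancestors(concept):
    """Walk the 'is a type of' chain upwards from a concept."""
    found = []
    while concept in SUBCLASS_OF:
        concept = SUBCLASS_OF[concept]
        found.append(concept)
    return found


def is_a(term, parent_term, labels=None):
    """Check 'term is a type of parent_term', optionally resolving labels first."""
    if labels is not None:
        term = concept_for(term, labels) or term
        parent_term = concept_for(parent_term, labels) or parent_term
    return parent_term in ancestors(term)
```

With the taxonomy alone, `is_a("watercolour", "painting")` holds while `is_a("aquarelle", "painting")` does not; only when the label table is supplied, as in `is_a("aquarelle", "peinture", LABELS)`, does the French term resolve to the same concept. Someone has to build and maintain that label table, which is part of the cost question raised above.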

Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.

Proof and trust is emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity, and a lack of methods for testing its presence.

LEGITIMISING THE SEMANTIC WEB INVESTMENT

Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of web-page design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen, we need a stronger argument for the benefits that such investment will bring to the heritage sector.

Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard, and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.

At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly, heritage institutions found ways to take advantage of the opportunities offered by the web, although there are still many small and medium-sized heritage institutions that have not.

Genesis – The Creation: Division of Sea and Earth; Creation of Trees and Plants

Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that the access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to the heritage sector institutions through better knowledge about, care of, and access to their collections. In the ALM sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited, except of course at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have too little visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
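What semantic mark-up of access arrangements might give a tourism agent can be sketched as follows. This is a toy illustration, not RDF itself: the subject, predicate and object triples imitate the shape of RDF statements, and the institution identifiers, property names and times are all invented for the example.

```python
from datetime import time

# Hypothetical statements published by two institutions, as (subject, predicate, object).
TRIPLES = [
    ("museum:A", "opens", time(10, 0)),
    ("museum:A", "closes", time(17, 0)),
    ("museum:B", "opens", time(13, 0)),
    ("museum:B", "closes", time(16, 0)),
]


def value_of(subject, predicate):
    """Return the object of the first matching statement, or None."""
    for s, p, o in TRIPLES:
        if s == subject and p == predicate:
            return o
    return None


def open_on_arrival(arrival):
    """List the institutions that are open at the given arrival time."""
    subjects = sorted({s for s, _, _ in TRIPLES})
    return [
        s for s in subjects
        if value_of(s, "opens") <= arrival < value_of(s, "closes")
    ]
```

An agent that had taken an arrival time from a bus timetable could call, for instance, `open_on_arrival(time(11, 30))` to decide which institution to propose. The logic is trivial; what matters is that it only works if every institution publishes its hours under the same agreed property names.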

The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.

CONCLUSION

Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.

Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.

Bibliography

Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds.), The Semantic Web - ISWC 2002, Berlin: Springer, 117-131.

Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.

Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3(1), 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf

Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.

Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.

Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds.), The Semantic Web - ISWC 2002, Berlin: Springer, 448-453.

Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.

Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.

Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 12010, 676-680.

Klein, M. 2001. XML, RDF and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.

McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.

Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds.), The Semantic Web - ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf

Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.

Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA), ACM Publication.

Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.

DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL

AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS

By Joost van Kasteren

‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’

Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.

The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).

The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.

Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
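The idea of mapping Dublin Core to the 5 Ws can be illustrated with a short sketch. The interview does not say which element goes under which W, so the assignment in `FIVE_WS` below is purely an assumption made for the example, as are the two sample records and the function names.

```python
# Assumed assignment of Dublin Core elements to the 5 Ws; the interview does not give one.
FIVE_WS = {
    "creator": "who",
    "contributor": "who",
    "coverage": "where",
    "title": "what",
    "subject": "what",
    "date": "when",
    "description": "why",
}

# Hypothetical records from two different institutional databases.
RECORDS = [
    {"title": "Salome", "creator": "Unknown master", "date": "1480", "coverage": "Utrecht"},
    {"title": "Book of Hours", "contributor": "Workshop of the unknown master",
     "date": "1460", "format": "parchment"},
]


def by_five_ws(record):
    """Group a record's values under who, where, what, when and why."""
    grouped = {}
    for element, value in record.items():
        w = FIVE_WS.get(element)
        if w is None:
            continue  # elements without an assumed mapping are left out
        grouped.setdefault(w, []).append(value)
    return grouped


def search(records, w, term):
    """Return the records whose values under the given W contain the term."""
    term = term.lower()
    return [
        r for r in records
        if any(term in v.lower() for v in by_five_ws(r).get(w, []))
    ]
```

A query such as `search(RECORDS, "who", "unknown")` finds both records even though one uses `creator` and the other `contributor`. It also shows the fuzziness Kersen mentions: the distinction between the two roles is lost, and the `format` element, which has no W in this assumed mapping, drops out of the search altogether.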

According to Kersen, a real Semantic Web is still a long way off: ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’

Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.

There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
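As a rough illustration of what describing content in RDF with Dublin Core elements can look like, the following Python sketch writes a few statements about one collection item as N-Triples lines. The item URI, the field values and the helper names `escape` and `to_ntriples` are invented for this example; only the Dublin Core element namespace comes from the published standard. It is a sketch under those assumptions and does not represent the SitecheckerRDF project's own approach.

```python
# Namespace of the Dublin Core element set.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical item URI and values, invented for this example.
ITEM = "http://collection.example.org/item/42"
DESCRIPTION = {
    "title": "Example painting",
    "type": "painting",
    "language": "nl",
}


def escape(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(subject, fields):
    """Return one 'subject predicate object .' line per Dublin Core field."""
    lines = []
    for element, value in fields.items():
        lines.append(f'<{subject}> <{DC}{element}> "{escape(value)}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(ITEM, DESCRIPTION))
```

Each printed line is one statement (subject, property, value). Because the property names are shared URIs rather than local field names, two institutions that use them can be read by the same software even when their internal databases differ.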

Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.

The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)

Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’

Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl


DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

Genesis – The Creation Birds and Fishes

It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.

‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction, it is like North. You go North but you never arrive and say “here it is”. This is the Semantic Web,’ said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge, world-wide investment in time and effort creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.

The Berners-Lee et al dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’

Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.

The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)³. But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’

3 The museum and Web site are rich resources for the life and work of Galileo. http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US.
5 CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), httpwwwwillpowerinfomybycoukcidocCIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee in the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from a NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the Web standards HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also: John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see: S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe. An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
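To make the harvesting idea in the OAI information box more concrete, the following Python sketch builds OAI-PMH request URLs and reads Dublin Core titles out of a ListRecords response. The repository address, the record identifier and the sample response are invented for illustration, and the response is trimmed to the elements the sketch needs; the verbs `ListRecords` and `GetRecord`, the `metadataPrefix` value `oai_dc` and the XML namespaces are those of OAI-PMH 2.0. No network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint, invented for this example.
BASE_URL = "http://repository.example.org/oai"

# XML namespaces used by OAI-PMH 2.0 responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Trimmed, invented sample of what a ListRecords response might contain.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example object</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def build_request(verb, **arguments):
    """Build the URL for one OAI-PMH request (an ordinary HTTP GET)."""
    query = {"verb": verb}
    query.update(arguments)
    return BASE_URL + "?" + urlencode(query)


def harvest_titles(xml_text):
    """Return (identifier, title) pairs for each record in a response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        results.append((identifier, title))
    return results


if __name__ == "__main__":
    print(build_request("ListRecords", metadataPrefix="oai_dc"))
    print(harvest_titles(SAMPLE_RESPONSE))
```

A service provider such as a search engine would send the first URL to each participating repository and index whatever records come back, which is why the institutions themselves only need to expose their metadata rather than build the search service.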

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy⁶, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk⁷. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Markup Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms (topological relations, mereological⁹ relations, dependence relations, these kinds of things), then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
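The point about stipulating “before” can be sketched in a few lines of Python. The two functions below are alternative definitions of the same word for periods with a start and an end year: one excludes overlapping periods, the other admits them. The class name, the function names and the years are invented for illustration, and a formal ontology would state such definitions as logical axioms rather than program code; this is only a sketch of why the choice has to be made explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval; the years used below are illustrative only."""
    name: str
    start: int
    end: int


def before_strict(a, b):
    """Stipulation 1: a is 'before' b only if a ends before b starts."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """Stipulation 2: a is 'before' b if it starts and ends earlier,
    even when the two periods overlap."""
    return a.start < b.start and a.end < b.end


# Invented example periods.
first = Period("first period", 1400, 1600)
second = Period("second period", 1550, 1700)   # overlaps the first
third = Period("third period", 1650, 1800)     # starts after the first ends

if __name__ == "__main__":
    for other in (second, third):
        print(other.name,
              before_strict(first, other),
              before_allowing_overlap(first, other))
```

Two databases that both record ‘first period before second period’ agree only if they have adopted the same stipulation; for the overlapping pair above the two definitions give opposite answers, which is the kind of mismatch a shared foundational vocabulary is meant to remove.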

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

6 Office of the e-Envoy, http://www.e-envoy.gov.uk
7 Govtalk, http://www.govtalk.gov.uk
8 Sesame environment, httpwwwontoknowledgeorgtoolsfactsheetSesamehtml
9 Mereology n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also: M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So, this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting, just yet, that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So, where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Annotation amp AuthoringlsquoHow to annotate and have funrsquohttpannotationsemanticweborgfaqfaq2and other papers See alsohttpannotation semanticweborgtools Institutefor Applied Informatics and Formal DescriptionMethods University of Karlsruhe Germany

The Annotea Project a lsquoLive Early Adoption andDemonstration (LEAD)rsquo project of the World WideWeb Consortium (W3C) collaboration environmentwith shared annotationswwww3org2001Annotea

CIDOC CRMThe CIDOC Conceptual Reference Model was animportant reference point in the Forum discussion(see also the information box on the CIDOC CRMin the Summary) CIDOC the Comiteacute Internationalpour la Documentation is part of the InternationalCouncil for Museums (ICOM) Its Web site can befound at httpcidocicsforthgrA report on CIDOCacutes work on the CRM isprovided in lsquoThe CIDOC Conceptual ReferenceModelA Standard for Communicating CulturalContentsrsquo by Nick Crofts Martin Doerr and Tony Gill in Cultivate Interactive Issue 9 February 2003

httpwwwcultivate-intorgissue9chiosSee also their tutorials athttpcidocicsforthgrtutorialshtml

DAML - DARPA Agent Markup Language The Defense Advanced Research Projects Agency(DARPA) is the central research and developmentorganisation for the US Department of DefenseIts DAML programme is developing a languageand tools to facilitate Semantic Web conceptshttpwwwdamlorg

lsquoWhy Use DAMLrsquoAdam Pease white paperTeknowledge 10 April 2002httpwwwdamlorg200204whyhtml

DAML+OILDAML+OIL is a product of the DARPA JointUnited StatesEuropean Union ad hoc AgentMarkup Language CommitteeThe committeecreated a language with the best features of SHOEDAML OIL and several other markup approachesIt is a Web ontology language (latest release March2001) expected to provide a basis for future Webstandards for ontologies Seehttpwwwdamlorg200103daml+oil-indexhtml

Genesis – The Creation, Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html

Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999

XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30

For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of what a generic ontology has to cover is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.[1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.[2] Key to an understanding of the Semantic Web, therefore, is how these languages function, how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme: http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’[3]

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
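The view-and-group step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the table names `item` and `description`, the view name `item_view` and both function names are hypothetical, the data come from the primer's medieval image example, and the namespace is the fictitious one used in this chapter.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # fictitious namespace used in this primer
ET.register_namespace("mi", MI_NS)
FIELD_ORDER = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def build_demo_db():
    """Create an in-memory database; table and view names are hypothetical."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE item (item_id TEXT, type TEXT, manuscript TEXT, place TEXT, year TEXT)")
    con.execute("CREATE TABLE description (item_id TEXT, field TEXT, value TEXT)")
    con.execute(
        "INSERT INTO item VALUES (?, ?, ?, ?, ?)",
        ("image5kb78d38i", "Column Miniature", "Historic Bible", "Utrecht", "circa 1430"),
    )
    con.executemany(
        "INSERT INTO description VALUES (?, ?, ?)",
        [
            ("image5kb78d38i", "subject", "Eve emerges from Adam's body"),
            ("image5kb78d38i", "iconclass", "71A3421"),
            ("image5kb78d38i", "creator", "Alexander Master"),
        ],
    )
    # The view is a virtual table defined by an SQL query that joins both tables.
    con.execute(
        "CREATE VIEW item_view AS "
        "SELECT i.item_id, i.type, i.manuscript, i.place, i.year, d.field, d.value "
        "FROM item AS i JOIN description AS d ON d.item_id = i.item_id"
    )
    return con


def rows_to_documents(con):
    """Group the view's rows by item and combine each group into one XML string."""
    cur = con.execute("SELECT * FROM item_view ORDER BY item_id")
    documents = {}
    for item_id, group in groupby(cur, key=lambda row: row[0]):
        values = {}
        for _, type_, manuscript, place, year, field, value in group:
            # Columns repeated on every row are stored once; each row adds one field.
            values.update(type=type_, manuscript=manuscript, place=place, year=year)
            values[field] = value
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for name in FIELD_ORDER:
            if name in values:
                ET.SubElement(root, f"{{{MI_NS}}}{name}").text = values[name]
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


if __name__ == "__main__":
    for xml_text in rows_to_documents(build_demo_db()).values():
        print(xml_text)
```

Here three joined rows for the same item collapse into one document whose child elements follow a fixed order, which is the property a later validation step would rely on.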

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.

FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns): item_id, type, subject, iconclass, creator, manuscript, place, year

XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
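A small sketch may make this point concrete. In the following Python fragment the parser only delivers element names and text; whatever "meaning" appears in the output comes from the hypothetical `LABELS` table that the processing program itself supplies. The document is the primer's medieval image example, and the function name `describe` is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# The example document from this primer, given as bytes because it declares its own encoding.
DOCUMENT = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""

# Hypothetical interpretation table: the program, not the markup, decides
# that "creator" should be presented as "Made by".
LABELS = {"creator": "Made by", "place": "Made in", "year": "Date"}


def describe(xml_bytes, labels):
    """Return display lines for the elements this program knows how to interpret."""
    root = ET.fromstring(xml_bytes)
    lines = []
    for child in root:
        local_name = child.tag.split("}", 1)[-1]  # drop the "{namespace}" part
        if local_name in labels:
            lines.append(f"{labels[local_name]}: {child.text}")
    return lines


if __name__ == "__main__":
    for line in describe(DOCUMENT, LABELS):
        print(line)
```

Elements without an entry in the table, such as the manuscript, are parsed just as successfully but remain uninterpreted, which is exactly the sense in which the tags are meaningless to the processor.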

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document must begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>

Graphic 1: Database rows to XML process

Database rows grouped by item:
‘image5kb78d38i’, ‘Column Miniature’, ‘Eve emerges from Adam’s Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

Rows to XML process

XML document with the item data from the database rows (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
    image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace is declared in the start tag of the root element: xmlns:mi="http://www.m-i.org/images"

The namespace prefix mi: (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
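The rules above can be tried out with any XML parser, since a conforming parser rejects input that is not well-formed. The following minimal Python sketch uses the standard library parser; the helper name `is_well_formed` and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Each sample breaks (or respects) one of the syntax rules listed above.
SAMPLES = {
    "well-formed": '<mi:image xmlns:mi="http://www.m-i.org/images" '
                   'image_id="image5kb78d38i"><mi:creator>Alexander Master'
                   '</mi:creator></mi:image>',
    "case mismatch": "<image><Creator>Alexander Master</creator></image>",
    "missing closing tag": "<image><creator>Alexander Master</image>",
    "unquoted attribute": "<image image_id=image5kb78d38i></image>",
    "two root elements": "<image></image><image></image>",
}

if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(label, is_well_formed(text))
```

Well-formedness is only the syntactic minimum: a document can pass this check and still use elements that the community's agreed schema does not allow.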

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative’s XML Schema when they create their XML documents, for validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
• elements and attributes that can appear in a document
• which elements are child elements, as well as their order and number
• whether an element is empty or can include text
• the data types for elements and attributes
• as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “type”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
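What these declarations demand of a document can be spelled out by hand. The Python sketch below is not an XML Schema validator; it hard-codes the three constraints explained above (root element, required attribute, ordered sequence of text-only children) as a simplified stand-in, and the function name `check_against_rules` is hypothetical. In practice a museum would pass the schema file itself to a schema-aware XML processor.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # target namespace of the fictitious schema
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def check_against_rules(xml_bytes):
    """Return a list of problems; an empty list means the hand-coded rules are met."""
    root = ET.fromstring(xml_bytes)
    problems = []
    # The root element must be "image" in the target namespace.
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element must be 'image' in the m-i.org namespace")
    # use="required": the image_id attribute must be present.
    if root.get("image_id") is None:
        problems.append("required attribute 'image_id' is missing")
    # xs:sequence: each child exactly once, in the declared order, namespace qualified.
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if [child.tag for child in root] != expected:
        problems.append("child elements must appear exactly once, in the declared order")
    # Simple types: the children hold text only, no sub-elements.
    for child in root:
        if len(child):
            problems.append(f"{child.tag} must not contain child elements")
    return problems


VALID = (
    b'<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    b"<mi:type>Column Miniature</mi:type><mi:subject>Eve emerges from Adam's body</mi:subject>"
    b"<mi:iconclass>71A3421</mi:iconclass><mi:creator>Alexander Master</mi:creator>"
    b"<mi:manuscript>Historic Bible</mi:manuscript><mi:place>Utrecht</mi:place>"
    b"<mi:year>circa 1430</mi:year></mi:image>"
)

if __name__ == "__main__":
    print(check_against_rules(VALID))
```

A document that is well-formed but, say, lists the place before the creator or omits the image_id attribute would be reported here, which mirrors the difference between well-formedness and validity.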

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1 the Iconclass definition is 71A3421, Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
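For a plain notation such as 71A3421, the hierarchical path can be approximated by taking successively longer prefixes of the code. The Python sketch below rests on that simplifying assumption only; it ignores Iconclass features such as bracketed text and keys, the function names are hypothetical, and the label table contains just the few definitions quoted in this Thematic Issue.

```python
# Labels quoted in this Thematic Issue; all other levels are left unlabelled
# rather than guessed.
KNOWN_LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(notation):
    """Return every prefix of a plain notation, from the broadest class down."""
    return [notation[:length] for length in range(1, len(notation) + 1)]


def labelled_path(notation, labels):
    """Pair each level of the path with its label, or None where none is known."""
    return [(code, labels.get(code)) for code in hierarchical_path(notation)]


if __name__ == "__main__":
    for code, label in labelled_path("71A3421", KNOWN_LABELS):
        print(code, label or "(label not given here)")
```

This is what makes a hierarchical classification useful for retrieval: a search for everything under 71A can be answered by comparing code prefixes, without any understanding of what Genesis is.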

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture. [9]

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’ [10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM). [11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably. [12]


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.

[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl

[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser/ The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/mss/tempex.html

[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html

[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327

[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.

[12] Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.’
Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Graphic 2: RDF Data Model

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
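As a rough illustration of the data model, the two triples of Graphic 2 can be held in an ordinary set of three-part records. This is only a sketch in plain Python, not RDF/XML and not the FMS implementation; the `Triple` type, the `objects` helper and the exact form of the `MI` namespace string (with a trailing `#`) are assumptions made for the example.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # assumed form of the primer's example namespace


class Triple(NamedTuple):
    """One RDF statement: a subject node, a predicate arc and an object node."""
    subject: str
    predicate: str
    obj: str


# The two statements visualised in Graphic 2.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow every arc labelled `predicate` that leaves the node `subject`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


if __name__ == "__main__":
    print(objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf"))
```

Note that the same URI (here the one for Miniature) appears once as an object and once as a subject, which is what links the two statements into a single graph.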

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document, and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
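How a template rule is instantiated can be sketched with Python's standard `xml.etree.ElementTree` module, which supports a limited subset of XPath. The XML record below is a hypothetical stand-in for the document described in the section on XML: its element names, the `id` attribute and the paths `./type` and `./creator` are assumptions, and the rule format is a simplification rather than the syntax used by the FMS editor tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; element and attribute names are assumptions for this sketch.
XML_DOC = """
<image id="5kb78d38i">
  <type>Image Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""

# A mapping rule as a template triple: a path for the subject value,
# a fixed RDF property, and a path for the object value.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(root, rule):
    """Instantiate the template; return None if either path has no match."""
    subject_path, prop, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return None  # the rule does not match this document
    return (subject.strip(), prop, obj.strip())


root = ET.fromstring(XML_DOC)

if __name__ == "__main__":
    print(apply_rule(root, RULE))
```

With the record above, the rule evaluates to the triple given in the text: Image Miniature, mi:hasCreator, Alexander Master. A rule whose paths find nothing simply produces no triple.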

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
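The difference between the two cases in the note can be made concrete with a small Python sketch. The synonym sets, the class names and the helper functions below are invented for illustration and are not taken from the FMS ontology; the point is only that a term found in exactly one synonym set can be mapped automatically, while a term found in several sets must be left to the cataloguer.

```python
# Hypothetical synonym sets attached to ontology classes (all terms are invented).
SYNONYM_SETS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:DecoratedInitial": {"decorated initial", "illumination"},
    "mi:SchematicDrawing": {"schematic drawing", "diagram"},
}


def candidate_classes(term):
    """Return every ontology class whose synonym set contains the term."""
    wanted = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if wanted in terms)


def map_term(term):
    """Map a term automatically only when exactly one class claims it."""
    candidates = candidate_classes(term)
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown or polysemous term: a human has to decide


if __name__ == "__main__":
    print(map_term("Diagram"))                # a synonym with one interpretation
    print(candidate_classes("illumination"))  # a polysemous term with two candidates
```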

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images#

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
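The effect of transitivity can be sketched in a few lines of Python: starting from the two stated subclass relations, all implied superclasses are collected by following the relation repeatedly. The pair representation and the function name are choices made for this sketch, not part of RDF Schema itself.

```python
# The two stated rdfs:subClassOf relations, as (subclass, superclass) pairs.
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return all stated and implied superclasses of cls."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for sub, sup in subclass_of:
            # Each superclass is visited once, so cyclic input cannot loop forever.
            if sub == current and sup not in found:
                found.add(sup)
                todo.append(sup)
    return found


if __name__ == "__main__":
    print(sorted(superclasses("mi:ColumnMiniature")))
```

Only the relation between mi:ColumnMiniature and mi:Miniature was stated directly; mi:Image appears in the result because it is implied.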

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images#) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
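A validator of the kind mentioned earlier could check statements against such constraints roughly as follows. This is a minimal sketch, not the FMS tool: the domain and range declarations repeat the primer's mi:hasType examples, while the instance names (mi:item1 and so on) and their types are hypothetical. For brevity it compares declared types directly and does not take subclass relations into account.

```python
# Constraints as declared in the primer's examples.
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical instance data: resource -> classes it is declared an instance of.
TYPES = {
    "mi:item1": {"mi:ColumnMiniature"},
    "mi:item2": {"mi:ColumnMiniature"},
    "mi:item3": {"mi:Image"},
}


def violations(statement):
    """Return the constraints ('domain', 'range') that a statement violates."""
    subject, prop, value = statement
    problems = []
    if prop in DOMAIN and DOMAIN[prop] not in TYPES.get(subject, set()):
        problems.append("domain")
    if prop in RANGE and RANGE[prop] not in TYPES.get(value, set()):
        problems.append("range")
    return problems


if __name__ == "__main__":
    print(violations(("mi:item1", "mi:hasType", "mi:item2")))
    print(violations(("mi:item3", "mi:hasType", "mi:item2")))
    print(violations(("mi:item1", "mi:hasType", "mi:item3")))
```

A statement with an empty result satisfies both constraints; a property without any declared domain or range is never reported.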

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.’ [13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com Definitions: Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
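Combining harvested RDF cards amounts to taking the union of sets of triples, which the following Python sketch illustrates. The function name, the museum cards and the item identifiers are hypothetical; the actual FMS crawler and repository are not described at this level of detail in the primer.

```python
# Shared ontology statements (from the primer's example schema).
ONTOLOGY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}

# Hypothetical RDF cards, one set of triples per harvested description.
CARD_MUSEUM_A = {("a:item1", "rdf:type", "mi:ColumnMiniature")}
CARD_MUSEUM_B = {
    ("b:item7", "rdf:type", "mi:Miniature"),
    ("b:item7", "mi:hasCreator", "Alexander Master"),
}


def build_repository(ontology, cards):
    """Union the shared ontology with all harvested cards into one graph."""
    repository = set(ontology)
    for card in cards:
        repository |= set(card)  # identical triples are stored only once
    return repository


REPOSITORY = build_repository(ONTOLOGY, [CARD_MUSEUM_A, CARD_MUSEUM_B])
```

Because every museum maps its data to the same classes and properties, the merged set can be treated as a single graph without any further conversion.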

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
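View-based filtering can be approximated by a lookup that returns every instance whose class is the selected class or one of its subclasses. The sketch below is a simplification and not Ontogator's algorithm: it assumes each class has at most one direct superclass and each instance one class, and the instance names are hypothetical.

```python
# Direct superclass of each class (single inheritance assumed for this sketch).
SUBCLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
}

# Hypothetical instances and the class each one was catalogued under.
INSTANCES = {
    "mi:item1": "mi:ColumnMiniature",
    "mi:item2": "mi:Miniature",
    "mi:item3": "mi:Image",
}


def is_a(cls, selected):
    """True if cls equals the selected class or is one of its subclasses."""
    while cls is not None:
        if cls == selected:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_by_class(selected):
    """Return the instances matching the selected class restriction."""
    return sorted(item for item, cls in INSTANCES.items() if is_a(cls, selected))


if __name__ == "__main__":
    for view in ("mi:Image", "mi:Miniature", "mi:ColumnMiniature"):
        print(view, filter_by_class(view))
```

Selecting a more specific class narrows the result, which mirrors the stepwise constraining of views described above.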

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
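One simple way to derive such links, given here purely as an assumption for illustration, is to connect records that share the same value for the same property. The rules actually used by the FMS software are not specified in this primer; the metadata triples and the function below are hypothetical.

```python
from collections import defaultdict

# Hypothetical metadata triples from the repository.
METADATA = [
    ("mi:item1", "mi:hasCreator", "Alexander Master"),
    ("mi:item2", "mi:hasCreator", "Alexander Master"),
    ("mi:item3", "mi:hasCreator", "Unknown Master"),
]


def related_items(item, metadata):
    """Map each (property, value) of item to the other items sharing it."""
    by_value = defaultdict(set)
    for subject, prop, value in metadata:
        by_value[(prop, value)].add(subject)
    links = {}
    for subject, prop, value in metadata:
        if subject == item:
            others = sorted(by_value[(prop, value)] - {item})
            if others:
                links[(prop, value)] = others
    return links


if __name__ == "__main__":
    print(related_items("mi:item1", METADATA))
```

Each entry in the result could be rendered as a link group in the browser, for example ‘other images by the same creator’.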

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems. [15]

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’

| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’

| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)

| Learning: an agent will improve its performance over time


[14] T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

[15] M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/

K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (The Finnish Museums on the Semantic Web: Overview of the System’s Set-up)

Components shown in the graphic:
| User client: WWW browser (topic-based navigation, view-based filtering)
| Server: Ontogator software
| RDF database: semantic graph, the knowledge space of shared ontology and metadata
| Web crawler
| RDF Schema (ontology): semantic interoperability; RDF instances
| XML Schema: syntactic interoperability; XML repositories; Metadata Editor
| Relational schemas (DBMS): collection databases 1, 2, ... n

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications, UMR 7503, CNRS, INRIA, Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
| SPECTRUM-XML: a project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also: http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996, Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See: http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it


DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5, from Fol. 264r; size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7, from Fol. 2r; size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v; size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra; size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21, from Fol. 4va1; size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26, from Fol. 3v; size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r; size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


CONTENT

Guntram Geser
Introduction and Overview

Seamus Ross
Position Paper: Towards a Semantic Web for Heritage Resources

Interview with Janneke van Kersen
Development of the Semantic Web Must Begin at the Grass Roots Level

Michael Steemson
DigiCULT's Expert 13 Tangle with the Semantic Web

Semantic Web Terms and Reading List, A-X

Interview with Nicola Guarino
Semantic Web should be based on Well-founded Ontologies

Guntram Geser
A Cultural Heritage Semantic Web Example & Primer

The Darmstadt Forum Participants

DigiCULT Project Information

Imprint

INTRODUCTION AND OVERVIEW
By Guntram Geser

FUNCTION AND FOCUS

DigiCULT, as a support measure within the Information Society Technologies Programme (IST), will for a period of 30 months (beginning March 2002) provide a technology watch mechanism for the cultural and scientific heritage sector. Backed by a network of peer experts, the project monitors, discusses and analyses existing and emerging technologies likely to bring benefits to the sector.

To promote the results and encourage early take-up of relevant technologies, DigiCULT has put in place a rigorous publication agenda of seven Thematic Issues and three in-depth Technology Watch Reports, as well as the DigiCULT.Info e-journal, pushed to a growing database of interested persons and organisations on a regular basis. All DigiCULT products can be downloaded from the project Web site, http://www.digicult.info, as they become available. The opportunity to subscribe to DigiCULT.Info is also found here.

March 2003 saw the release of the first DigiCULT Technology Watch Report. This report covers the topics Customer Relationship Management, Digital Asset Management Systems, Smart Labels and Smart Tags, Virtual Reality and Display Technologies, Human Interfaces, and Games Technologies. Addressing primarily technological issues, it serves as a guide to what a heritage institution needs to consider when buying into one of these technologies.

In comparison with the Technology Watch Reports, the Thematic Issues focus more on the organisational, policy and economic aspects of the technologies under consideration. They are based on the expert round tables organised by the DigiCULT Forum secretariat. In addition to the Forum discussion, they provide opinions of other experts in the form of articles and interviews, case studies, and short descriptions of related projects, together with a selection of relevant literature.

TOPIC AND CHALLENGE

This third Thematic Issue addresses the questions: What is the Semantic Web? What will it do for heritage institutions? And what is the role of certain languages, in particular XML and RDF?

In short, the Semantic Web vision proclaims a Web of machine-readable data which allows software agents to automatically carry out rather complex tasks for humans. Key to realising this vision is semantic interoperability of Web resources. Yet such interoperability is not the primary goal of heritage institutions (and intelligent software agents are not readily at hand).

What the institutions are looking for are new ways of providing scholarly and non-expert users (e.g. school classes, lifelong learners) with access to their collections and related knowledge. This goal can be accomplished, for example, through online collections and exhibitions that not only display objects and simple descriptions (drawn from metadata), but also allow for understanding relationships between objects (created by semantically interrelated metadata). The Semantic Web community promises to assist in achieving this goal, but the challenge for the heritage institutions would be to first implement the necessary data infrastructure.

[Image: Philosophy in Discussion With a Philosopher]

The challenge for the Semantic Web expert round table was, or at least the DigiCULT Secretariat thought it was, not to run into a debate between ‘theory’ and ‘practice’; in other words, between what academic Semantic Web scholars and what practitioners from heritage institutions think needs to be accomplished, what is feasible and affordable, and where to concentrate efforts. For the discussion, XML seemed to provide a good starting point: on the one hand, XML is increasingly considered by heritage institutions as a key standard for publishing metadata on the Web; on the other hand, it is a major building block for the Semantic Web. It proved different, in a positive sense. In the discussion, wide use of XML was taken for granted, while the key area of interest that surfaced, and was seen to be most fruitful to explore, was ontologies.

OVERVIEW

Setting the context for this Issue, the position paper looks into the requirements for achieving the goals of the Semantic Web and assesses whether the available technologies will be able to deliver on what the advocates of the Semantic Web envisage, as well as whether the cultural heritage sector is in a position to take substantial steps towards semantic interoperability. It concludes with the argument that the sector is more likely to be left behind, due in particular to the fact that, for the institutions, the rewards for the necessary investments are still too nebulous.

Janneke van Kersen, from the Dutch Digital Heritage Association, in her interview with the DigiCULT Journalist suggests that despite the cloudy Semantic Web horizon there are medium-term benefits to be gained for heritage institutions in taking steps towards the vision. And she states that it is up to associations like hers, together with larger institutions, to take the lead in this, prove that proposed solutions work, and support smaller institutions in taking advantage of them. On the other hand, Nicola Guarino, in his interview, believes that reaching the ‘real’ Semantic Web lies in taking ‘the fundamental route’ of implementing generic ontologies based on linguistics and logics within the Semantic Web fabric. He also claims that even incremental progress along this path can have remarkable pay-offs.

Michael Steemson's summary of the Darmstadt Forum illustrates that the Semantic Web topic resembles a labyrinth with currently no definite map or Ariadne's Thread at hand. Building on the many technologies the Forum participants mentioned as some of the labyrinth's angles, we have added to the summary a list of resources related to these technologies.

In an effort to raise the veil of mystery surrounding the Semantic Web, this issue includes an example from the sector on the implementation of semantic interoperability of metadata, combined with a primer that explains core building blocks such as XML, RDF and ontologies. While a detailed primer of, for example, RDF would alone exhaust the limits of this issue,¹ the goal here is to deliver an ‘all-inclusive’ primer within the space permitted, with all the inevitable limitations this entails. The primer attempts to provide a general understanding of the Semantic Web architecture without obliging the reader to wander through the long and perplexing corridors of language specifications.

Finally, we want to thank the Koninklijke Bibliotheek, National Library of the Netherlands, for their kind permission to use selected images from their collection of illuminated medieval manuscripts.² We hope you will appreciate the little narratives they represent within the overall fabric of this DigiCULT Thematic Issue.

1 cf. F. Manola, E. Miller: RDF Primer (W3C Working Draft, 23 January 2003), http://www.w3.org/TR/rdf-primer/

2 See their online collection of such images at http://www.kb.nl/kb/manuscripts/, which offers advanced search and presentation features.

POSITION PAPER
By Seamus Ross

Tim Berners-Lee and his colleagues at W3C have recognised that the real benefits of the web-based information revolution will come from enabling the interoperability of content. The current generation of web delivery is, they have argued, designed for human users, who struggle to make effective use of the billions of pages of information currently accessible. When we search for something at the moment we sometimes discover suitable candidate information, but more often than not this is far from being the case. More than this, the entire process of searching, discovery and use is designed to be driven by humans. When we discover one piece of the puzzle, we need manually to position that information so that it can help us to search out the next piece of the puzzle. We find that Darmstadt is near Frankfurt. Then we find that there are flights from Glasgow to Frankfurt and there is a bus from Frankfurt Airport to Darmstadt. Then I search for timetables, make manual comparisons and decide which times best meet my requirements. In the Shangri-La that is the Semantic Web, my ‘agent’ would recognise from its regular review of my diary that I needed to be at a meeting in Darmstadt on the 21st of January 2003, and it would search out the options, analyse the timetables, identify the optimum travel arrangements, book my non-smoking hotel accommodation and order the taxi to take me to the airport. (It might even check the weather forecasts and warn me to bring particular types of clothing.) Certainly, to make this happen there has to be a fundamental shift in the way data, information and knowledge are represented on the web.

[Image: Genesis – The Creation: Division of Light and Darkness]

The proliferation of web-based resources makes finding what you are looking for increasingly difficult. According to Internet user studies, in 1996 50% of Internet users reported spending time looking for information without finding it, but by 2002 only about 40% of users ended their ‘searching sessions’ unsuccessfully. At first glance we might conclude that web discovery tools have improved and/or the information searching skills of users have improved. Over the past seven years the quantity of content has mushroomed, the search tools have become more efficient, developers approach the use of meta-tags more effectively, and anecdotal evidence suggests that the searching techniques of users have become more sophisticated. We should continue to be surprised by the high failure rate and wonder why it remains proportionally so high as the numbers of users have grown to nearly 600 million. In reality there is just too much content available. It is poorly described. It is not interconnected. Search engines themselves are blunt instruments. Most users of the web do not have very mature searching strategies and rarely use even the blunt instruments as effectively as they might. A solution is to make more of the information capable of discovery, interpretation and reuse by automated information processing tools themselves. However, the current ways in which content is represented on the web make it nearly impossible for machines to search the web meaningfully and effectively; even with the limitations of their skills and tools, humans are better at searching the web than the most powerful of the current generation of agents. The emergence of the Semantic Web would solve this problem.

The web has made us realise the tremendous potential of digital resources and made them widely available. Content as presented on the web currently is mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented so that it is visible and discoverable within the context of the Semantic Web.

[Image: Genesis – The Creation: Division of the Waters Above and Below the Firmament]

The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.

TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES

Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about it in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL,¹ do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S), because it provides a flexible and extensible mechanism to represent metadata.

A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response, other languages such as DAML+OIL have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web. If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).

The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains, it ‘does not encode the data's use and semantics’, and DTDs and XML Schemas do not specify the data's meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?
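The point that XML names elements and attributes without saying what they mean can be illustrated with a small sketch. It is only an illustration under stated assumptions: the two records, their element and attribute names, and the helper functions are invented here and are taken from no real schema or heritage system.

```python
import xml.etree.ElementTree as ET

# Two hypothetical records describing the same object. The element and
# attribute names are invented for this sketch; they come from no real schema.
RECORD_A = "<object><title>Salome</title><creator>Unknown master</creator></object>"
RECORD_B = '<item name="Salome"><madeBy>Unknown master</madeBy></item>'


def element_names(xml_text):
    """Return the set of element names used in an XML string."""
    return {el.tag for el in ET.fromstring(xml_text).iter()}


def read_title(xml_text, mapping):
    """Look up the title using an explicit, human-supplied mapping.

    mapping is either ("element", name) or ("attribute", name).
    """
    root = ET.fromstring(xml_text)
    kind, name = mapping
    if kind == "element":
        return root.findtext(name)
    return root.get(name)


# A parser sees two well-formed documents with no element names in common ...
shared = element_names(RECORD_A) & element_names(RECORD_B)

# ... and only an agreement made outside the XML says where the title lives.
title_a = read_title(RECORD_A, ("element", "title"))
title_b = read_title(RECORD_B, ("attribute", "name"))
```

Both records are well-formed and carry the same information for a human reader, yet nothing in the markup tells a program that `title` and `name` play the same role. That agreement has to be supplied separately, which is the gap that shared vocabularies and ontologies are meant to fill.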

1 See the ‘Semantic Web Terms and Reading List’ in this Issue.


ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB

For the Semantic Web to succeed it will require not only modelling languages such as XML, RDF and OWL, but it will also require methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.

The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions we will follow James Hendler's definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections, and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but it does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:

- Can we cost the creation of appropriate ontologies for the heritage sector?
- How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop, and which ones will we be able to borrow from other sectors)?
- What heritage-based organisations should focus on ontology creation?
- Ontologies often fail to be interoperable. What solutions are there to this problem, and how can they be made to work effectively?
- Does OWL (W3C's Web Ontology Semantic Markup Language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
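The ‘watercolour is a type of painting’ example, and the multilingual gap that goes with it, can be made concrete with a minimal sketch. It is not OWL and makes no claim about how an ontology language works internally. The two dictionaries and the function names are hypothetical, chosen only to show a class taxonomy, a simple inference over it, and the explicit label mapping that multilingual use would need.

```python
# Hypothetical mini-ontology: every term and label below is illustrative.
SUBCLASS_OF = {
    "watercolour": "painting",
    "painting": "artwork",
    "necklace": "jewellery",
}

# Labels in other languages have to be mapped explicitly to a concept.
LABELS = {
    "watercolour": "watercolour",
    "aquarelle": "watercolour",
    "painting": "painting",
    "peinture": "painting",
}


def is_a(term, ancestor):
    """Follow 'is a type of' links upwards from term until ancestor is found."""
    current = term
    seen = set()
    while current in SUBCLASS_OF and current not in seen:
        seen.add(current)
        current = SUBCLASS_OF[current]
        if current == ancestor:
            return True
    return False


def is_a_multilingual(label, ancestor_label):
    """Resolve both labels to concepts first, then apply the same inference."""
    term = LABELS.get(label)
    ancestor = LABELS.get(ancestor_label)
    if term is None or ancestor is None:
        return False
    return is_a(term, ancestor)
```

With only the taxonomy, `is_a("watercolour", "painting")` holds but `is_a("aquarelle", "painting")` does not, because the taxonomy has never heard of ‘aquarelle’. Only once someone has supplied the label table does `is_a_multilingual("aquarelle", "peinture")` succeed, which is one reason the costing and prioritisation questions above matter.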

Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum running from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.

Proof and trust is emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity, and a lack of methods for testing its presence.

LEGITIMISING THE SEMANTIC WEB INVESTMENT

Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of web page design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analyses that are necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler's proposal that semantic mark-up should be embedded into web page design fails to recognise that the fundamental fabric of the web is changing. For this to happen we need a stronger argument for the benefits that such investment will bring to the heritage sector.

Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.

At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly heritage institutions found ways to take advantage of the opportunities offered by the web; there are still many small and medium sized heritage institutions that have not.

[Image: Genesis – The Creation: Division of Sea and Earth; Creation of Trees and Plants]

Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that the access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to the heritage sector institutions through better knowledge about, care of, and access to their collections. In the ALM sector, only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited except, of course, at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have visible content that is actually usable. If the heritage sector is to make a near term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
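What machine-readable access information might let a simple agent do can be sketched as follows. This is a toy under stated assumptions, not a proposal for a vocabulary: the institution identifiers, the `opensOn` property and the opening hours are all invented, and the statements are held in a plain Python list rather than in RDF.

```python
from datetime import datetime, time

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical machine-readable access information, one (subject, property,
# value) statement per entry. Institution names, the property name and the
# hours are invented for this sketch.
STATEMENTS = [
    ("museum:A", "opensOn", ("Tuesday", time(10, 0), time(17, 0))),
    ("museum:A", "opensOn", ("Saturday", time(11, 0), time(16, 0))),
    ("museum:B", "opensOn", ("Tuesday", time(13, 0), time(18, 0))),
]


def is_open(institution, when):
    """Return True if any opening statement for the institution covers 'when'."""
    day = WEEKDAYS[when.weekday()]
    for subject, prop, (weekday, opens, closes) in STATEMENTS:
        if subject == institution and prop == "opensOn" and weekday == day:
            if opens <= when.time() < closes:
                return True
    return False


def open_institutions(when):
    """List every institution that is open at the given moment."""
    names = sorted({subject for subject, _, _ in STATEMENTS})
    return [name for name in names if is_open(name, when)]
```

Asked about 11:00 on Tuesday 21 January 2003, `open_institutions` returns only the first institution, and by 14:00 it returns both. The sketch works only because both institutions happen to describe their hours in the same way, which is precisely the co-ordination the sector would have to achieve first.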

The days when a curator who wishes to hold anexhibition on the representations of Salome since the15th century will be able to lsquoloadrsquo an agent with therequest to identify select negotiate the loan of andarrange the transportation of the key 100 works ofart are a long way offThe fundamental descriptionsof holdings are not currently available where theyexist they are not online and certainly have not beensemantically encoded to make them usable by ourSalome agent For those who have worked onKnowledge Representation the vision of theSemantic Web holds promise Knowledge Repre-sentation is hard especially if you intend any parti-cular representation to be usable by others either as adecision making resource or as for research purposesEfforts in the 1980s and early 1990s such as those in archaeology failedThe reasons KnowledgeRepresentation failed to achieve its promise rangedfrom the poor quality of knowledge extractionstrategies the lack of fundamental representationmethodologies the limited applicability of methodsto knowledge domains the problems of boundaryconstraint and creep to the high costs of developingapplicationsThe Semantic Web could breathe newlife into this earlier promise by providing ways tocarve up the problem while bringing us immediatesuccesses


CONCLUSION

Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and placing in the public domain a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value; (b) an accessible and narrow ontology; and (c) a personalisable tool for processing knowledge.

Ultimately the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the Web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.

Bibliography

Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002. Berlin: Springer, 117-131.

Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.

Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3(1), 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf

Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.

Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.

Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002. Berlin: Springer, 448-453.

Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.

Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.

Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 12010, 676-680.

Klein, M. 2001. XML, RDF and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.

McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.

Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002. Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf

Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.

Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA). ACM Publication.

Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.


DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL

AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS

By Joost van Kasteren

‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’

Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.

The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and to assure cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).

The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is being carried out on applying the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.

Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
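What mapping Dublin Core to the five Ws might look like can be sketched in a few lines of Python. The interview does not say which element goes with which W, so the table below is purely an assumption made for illustration, as are the function name and the sample record; it is not the scheme used by the DEN or Cultuurwijzer.

```python
# Assumed, illustrative mapping from unqualified Dublin Core element names
# to the five Ws. The interview does not specify the actual assignment.
FIVE_WS = {
    "creator": "who",
    "contributor": "who",
    "coverage": "where",
    "title": "what",
    "subject": "what",
    "date": "when",
    "description": "why",
}


def to_five_ws(record):
    """Group the values of a flat Dublin Core record under the five Ws.

    Elements that have no entry in FIVE_WS are left out, which is one
    source of the 'fuzziness and lack of precision' mentioned above.
    """
    grouped = {w: [] for w in ("who", "where", "what", "when", "why")}
    for element, value in record.items():
        w = FIVE_WS.get(element)
        if w is not None:
            grouped[w].append(value)
    return grouped


if __name__ == "__main__":
    # Invented sample record, not taken from any real collection.
    sample = {
        "creator": "Example Painter",
        "title": "Untitled study",
        "date": "1650",
        "format": "oil on canvas",
    }
    print(to_five_ws(sample))
```

A cross-database search can then be phrased in terms of the five Ws, at the cost of losing whatever the institution-specific fields expressed more precisely.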

According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’

Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.

There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’

Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.

The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)

Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works; a “proof of concept”, you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’

Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl


DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

Genesis – The Creation: Birds and Fishes

It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds, and they were certain now.

‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction; it is like North. You go North, but you never arrive and say “here it is”. This is the Semantic Web,’ said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York, and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article[1], Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.

The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology[2] for digitised works of art. It had encountered problems with language representation like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’

Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

[1] T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001. http://www.sciam.com/2001/0501issue/0501berners-lee.html
[2] Ontology, n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.

The Dutch had doubts too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman Dr Frank Nack, from the national research institute for mathematics and computer science CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely, and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)[3]. But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos[4] - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC[5] reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’

[3] The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
[4] Telos Corporation, http://www.telos.com, Ashburn, VA, US
[5] CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), httpwwwwillpowerinfomybycoukcidocCIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the Web standards HTTP and XML and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001. http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003. http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
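For readers who want to see what a harvesting request under the OAI Protocol for Metadata Harvesting looks like, here is a minimal sketch in Python. It only builds the request URLs and makes no network call. The verbs GetRecord and ListRecords and the metadataPrefix value oai_dc (unqualified Dublin Core) come from the protocol itself; the repository address, the record identifier and the helper function name are invented for illustration.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, invented for this example.
BASE_URL = "http://repository.example.org/oai"


def oai_request(verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its arguments.

    An OAI-PMH request is an ordinary HTTP request whose query string
    carries a 'verb' plus the arguments that verb expects.
    """
    params = {"verb": verb}
    params.update(arguments)
    return BASE_URL + "?" + urlencode(params)


# Ask for all records as unqualified Dublin Core.
list_url = oai_request("ListRecords", metadataPrefix="oai_dc")

# Ask for one record; the identifier is a made-up placeholder.
get_url = oai_request(
    "GetRecord",
    identifier="oai:repository.example.org:1",
    metadataPrefix="oai_dc",
)

if __name__ == "__main__":
    print(list_url)
    print(get_url)
```

A service provider would fetch such a URL over HTTP and receive an XML response that can be validated against the schema the protocol defines.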

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy[6], the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk[7]. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something something dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame[8] and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Markup Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard MPEG-7 and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one; do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms (topological relations, mereological[9] relations, dependence relations, these kinds of things) then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
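The stipulation about ‘before’ can be made tangible with a small Python sketch. It is an illustration only: the class, the two function names and the year values are assumptions made for this example, not Mr Guarino's axioms and not real period datings. Each function fixes one of the two readings, so that anyone using the term knows whether overlapping periods count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval; years are inclusive. Illustrative only."""
    name: str
    start: int
    end: int

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("a period cannot end before it starts")


def before_strict(a, b):
    """Reading 1: a is before b only if a has ended when b starts.

    Overlapping periods are excluded.
    """
    return a.end < b.start


def before_allowing_overlap(a, b):
    """Reading 2: a is before b if a starts earlier than b.

    Overlapping periods are included.
    """
    return a.start < b.start


# Invented year values, chosen only so that the two periods overlap.
earlier = Period("earlier period", 1400, 1600)
later = Period("later period", 1550, 1700)

if __name__ == "__main__":
    print(before_strict(earlier, later))            # False: they overlap
    print(before_allowing_overlap(earlier, later))  # True: earlier starts first
```

Two catalogues that both record ‘before’ can disagree on this very pair of periods unless the chosen reading is written down, which is the role the axioms play in a foundational ontology.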

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds), John Wiley, December 2002.

[6] Office of the e-Envoy: http://www.e-envoy.gov.uk
[7] Govtalk: http://www.govtalk.gov.uk
[8] Sesame environment: httpwwwontoknowledgeorgtoolsfactsheetSesamehtml
[9] Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003. http://www.cultivate-int.org/issue9/mesmuses

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage - architecture, film, whatever - all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others?’

THE LUCK CHANGES

The luck of the 13 was beginning to changeTheAthenian heritage informatics expert Dr Dallasdescribed work among his company clients ondeveloping lsquoan upper ontologyrsquoAll the issues theForum had been discussing what to do about time abasic concepts process agents and so on were beingexaminedThey were beginning to developlsquosomething very much like a thesaurusrsquo with termexpansion that created sub-categories of relationshipsHe called it lsquogeneric layeringrsquo a process that couldidentify the lsquogeneric grammarrsquo of relationships withina specific domain - art history for example

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’

The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Annotation & Authoring

‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2 and other papers. See also http://annotation.semanticweb.org/tools. Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): a collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM

The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr

A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios

See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language

The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL

DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


Genesis - The Creation, Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X

Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.


OIL - Ontology Inference Layer

OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies

Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel, Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’ includes a detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language

The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework

See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web

‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE - Simple HTML Ontology Extensions

SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web

The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo

SPECTRUM-XML DTD

SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth, Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services

Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web

Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML - eXtensible Markup Language

See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991, he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory, the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a notion that a generic ontology has to pin down is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
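The ‘part’ problem can be made concrete with a small sketch. It is only an illustration under our own assumptions, not Guarino's formalisation: the two relation names (COMPONENT_OF, MEMBER_OF) and the function reachable are hypothetical. The point it tries to show is that once the everyday word ‘part’ is split into two explicitly named relations, a program no longer concludes that the finger is part of the orchestra.

```python
# Two distinct relations that everyday language both calls "part of".
# The names are hypothetical labels chosen for this illustration.
COMPONENT_OF = {("finger", "violist")}      # physical component of a body
MEMBER_OF = {("violist", "orchestra")}      # member of a social group


def reachable(relation, start, goal):
    """Return True if goal can be reached from start by chaining pairs
    of the given relation (a plain depth-first search)."""
    frontier, seen = [start], set()
    while frontier:
        node = frontier.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        frontier.extend(b for a, b in relation if a == node)
    return False


if __name__ == "__main__":
    ambiguous_part = COMPONENT_OF | MEMBER_OF   # one undifferentiated "part"
    print(reachable(ambiguous_part, "finger", "orchestra"))
    print(reachable(COMPONENT_OF, "finger", "orchestra"))
```

With the merged relation the chain finger, violist, orchestra goes through; with the component relation alone it does not, which is the distinction an unambiguous definition of ‘part’ would have to encode.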

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling: ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it


SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES

AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren


The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed so that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision into a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER

By Guntram Geser

¹ Tim Berners-Lee, Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
² Tim Berners-Lee, Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
³ Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that, while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents

In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
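The view-to-XML step can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the FMS implementation: the two tables (item, item_field), the view name item_view and the functions build_demo_database and rows_to_xml are hypothetical, and an in-memory SQLite database stands in for a museum's collection database. Only the m-i.org namespace and the sample values come from the primer's running example.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"   # namespace of the fictitious m-i.org site
ET.register_namespace("mi", MI_NS)    # serialise elements with the mi: prefix


def build_demo_database():
    """Create an in-memory database with two hypothetical tables and a view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, creator TEXT);
        CREATE TABLE item_field (item_id TEXT, name TEXT, value TEXT);
        INSERT INTO item VALUES
            ('image5kb78d38i', 'Column Miniature', 'Alexander Master');
        INSERT INTO item_field VALUES
            ('image5kb78d38i', 'place', 'Utrecht'),
            ('image5kb78d38i', 'year', 'circa 1430');
        -- The view is a virtual table defined by a query joining both tables.
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.creator, f.name, f.value
            FROM item AS i JOIN item_field AS f ON f.item_id = i.item_id;
        """
    )
    return conn


def rows_to_xml(conn):
    """Group the view's rows by item and build one XML document per item."""
    rows = conn.execute(
        "SELECT item_id, type, creator, name, value FROM item_view "
        "ORDER BY item_id, name"
    ).fetchall()
    documents = {}
    for item_id, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        # Columns repeated on every joined row are written once per item.
        ET.SubElement(root, f"{{{MI_NS}}}type").text = group[0][1]
        ET.SubElement(root, f"{{{MI_NS}}}creator").text = group[0][2]
        # Each remaining row contributes one child element.
        for _, _, _, name, value in group:
            ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


if __name__ == "__main__":
    for xml_text in rows_to_xml(build_demo_database()).values():
        print(xml_text)
```

The grouping is done in the application rather than in SQL, so that several joined rows for the same item collapse into one document, which is the behaviour the text describes.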

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents

A short presentation is provided in Eero Hyvönen et al., Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions, see Vilho Raatikka, Eero Hyvönen, Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al., Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services: Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own, it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
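A short sketch may help to show where the meaning actually lives. It assumes Python's standard xml.etree.ElementTree parser; the sample record, the TAG_MEANINGS table and the interpret function are hypothetical and exist only for this illustration. The parser hands back both tags as plain strings, and only the lookup table held by the processing program attaches any meaning to one of them.

```python
import xml.etree.ElementTree as ET

# A made-up record that uses one "meaningful" and one layout-style tag.
RECORD = (
    "<record>"
    "<creator>Alexander Master</creator>"
    "<H1>Alexander Master</H1>"
    "</record>"
)

# Hypothetical lookup table: the program, not the markup, holds the meaning.
TAG_MEANINGS = {"creator": "person who made the item"}


def interpret(xml_text):
    """Return (tag, text, meaning) triples; unknown tags get meaning None."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text, TAG_MEANINGS.get(child.tag)) for child in root]


if __name__ == "__main__":
    for tag, text, meaning in interpret(RECORD):
        print(tag, "|", text, "|", meaning)
```

To the parser, creator and H1 are equally opaque names; removing the entry from TAG_MEANINGS would make them indistinguishable to the program as well.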

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document must begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2/10 a) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Graphic 1: Database rows to XML process

Database rows, grouped by item:
image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

Rows to XML process

XML document with item data from the database rows (image5kb78d38i.xml):

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
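That the prefix is only a placeholder can be checked with a namespace-aware parser. The sketch below assumes Python's standard xml.etree.ElementTree; the second document, with the prefix img, and the function expanded_names are hypothetical additions for the comparison. The parser replaces each prefix with the namespace URI, so both documents yield the same expanded element names.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"

# The same content written with two different prefixes for one namespace.
DOC_MI = (
    '<mi:image xmlns:mi="http://www.m-i.org/images">'
    "<mi:type>Column Miniature</mi:type>"
    "</mi:image>"
)
DOC_IMG = (
    '<img:image xmlns:img="http://www.m-i.org/images">'
    "<img:type>Column Miniature</img:type>"
    "</img:image>"
)


def expanded_names(xml_text):
    """Return every element name in {namespace-URI}local-name form."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


if __name__ == "__main__":
    print(expanded_names(DOC_MI))
    print(expanded_names(DOC_MI) == expanded_names(DOC_IMG))
    # Searching also works through any prefix the caller maps to the URI.
    root = ET.fromstring(DOC_MI)
    print(root.find("m:type", {"m": MI_NS}).text)
```

What identifies an element is therefore the pair of namespace URI and local name, not the prefix chosen in a particular document.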

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
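The syntax rules listed above can be tried out with any XML parser, since a conforming parser must reject a document that is not well-formed. The following sketch assumes Python's standard xml.etree.ElementTree; the function name is_well_formed and the deliberately broken sample strings are our own.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


GOOD = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)
CASE_MISMATCH = "<image><Creator>Alexander Master</creator></image>"
MISSING_END_TAG = "<image><creator>Alexander Master</image>"
UNQUOTED_ATTRIBUTE = "<image image_id=image5kb78d38i></image>"
TWO_ROOTS = "<image></image><image></image>"

if __name__ == "__main__":
    for sample in (GOOD, CASE_MISMATCH, MISSING_END_TAG,
                   UNQUOTED_ATTRIBUTE, TWO_ROOTS):
        print(is_well_formed(sample), sample[:40])
```

Well-formedness is a purely syntactic property: the check says nothing about whether the element names are the ones a particular application expects, which is the job of the schema discussed next.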

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums validate the XML documents they create against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)  <xs:element name="image">
(4)    <xs:complexType>
(5)      <xs:sequence>
(6)        <xs:element name="type" type="xs:string"/>
(7)        <xs:element name="subject" type="xs:string"/>
(8)        <xs:element name="iconclass" type="xs:string"/>
(9)        <xs:element name="creator" type="xs:string"/>
(10)       <xs:element name="manuscript" type="xs:string"/>
(11)       <xs:element name="place" type="xs:string"/>
(12)       <xs:element name="year" type="xs:string"/>
(13)     </xs:sequence>
(14)     <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)   </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.

(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.

(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.

(2c) Is our default namespace.

(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.

(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.

(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “type”, to be of the data type xs:string.

(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
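To make the constraints of this schema tangible, the sketch below checks a record by hand in Python. It is not XML Schema validation: the Python standard library has no schema validator, so the three rules (namespace-qualified root element, required image_id attribute, fixed sequence of child elements) are simply re-stated in code. The function names and the sample record are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"  # the (fictitious) target namespace of the schema
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image_record(xml_text):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    root = ET.fromstring(xml_text)
    # ElementTree writes a namespace-qualified name as "{namespace}local".
    if root.tag != "{%s}image" % NS:
        problems.append("root element is not a namespace-qualified image element")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = ["{%s}%s" % (NS, name) for name in EXPECTED_CHILDREN]
    if found != wanted:
        problems.append("child elements do not match the expected sequence")
    return problems


def sample_record(with_id=True):
    """Build a hypothetical record with placeholder values."""
    attr = ' image_id="image5kb78d38i"' if with_id else ""
    children = "".join("<%s>value</%s>" % (n, n) for n in EXPECTED_CHILDREN)
    return '<image xmlns="%s"%s>%s</image>' % (NS, attr, children)


if __name__ == "__main__":
    print(check_image_record(sample_record()))
    print(check_image_record(sample_record(with_id=False)))
```

In practice a museum would rely on a validating parser rather than on hand-written checks; the point here is only to show what kind of errors the schema is meant to catch before the process moves on to the semantic level.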

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
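As a rough illustration of multi-channel publishing, the following Python sketch renders one and the same XML record for two ‘channels’, an HTML list and a plain-text listing. In a real set-up this would typically be the task of style sheets; the two functions merely stand in for them, and the record content is a hypothetical fragment.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical record fragment; only two of the metadata elements are shown.
RECORD = ('<image image_id="image5kb78d38i">'
          '<creator>Alexander Master</creator>'
          "<subject>Eve emerges from Adam's body</subject>"
          '</image>')


def to_html(xml_text):
    """Render the child elements of the record as an HTML list."""
    root = ET.fromstring(xml_text)
    items = "".join(
        "<li>%s: %s</li>" % (escape(child.tag), escape(child.text or ""))
        for child in root
    )
    return "<ul>%s</ul>" % items


def to_text(xml_text):
    """Render the same record as plain text, one element per line."""
    root = ET.fromstring(xml_text)
    return "\n".join("%s: %s" % (child.tag, child.text or "") for child in root)


if __name__ == "__main__":
    print(to_html(RECORD))
    print(to_text(RECORD))
```

The data are written once; only the presentation layer differs per medium.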

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge. [4]

Ontology

In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology or, rather, an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge. [5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies [6], and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.

[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge, there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate. [7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421: Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture. [9]

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’ [10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM). [11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably. [12]


7 Bible
71 Old Testament
71A Genesis: from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
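In this listing, each level of the hierarchy is obtained by dropping the last character of the code below it. The Python sketch that follows rebuilds the path from the notation alone. It assumes that every prefix of the code is itself a valid notation, which holds for this example but is not claimed for Iconclass codes in general.

```python
# Textual correlates copied from the hierarchy listed above.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis: from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(notation):
    """Return all prefixes of a notation, from the top level downwards.

    Assumption: each prefix is a valid broader notation (true for 71A3421).
    """
    return [notation[:i] for i in range(1, len(notation) + 1)]


if __name__ == "__main__":
    for code in hierarchical_path("71A3421"):
        print(code, LABELS.get(code, "(no label in this sketch)"))
```

A search for the broader code 71A34 can thus also retrieve images indexed with 71A3421, which is what makes such a classification useful for concept navigation.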

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.

[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl

[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml

[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html

[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327

[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.

[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources;

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;

| statements: these associate a value for a named property with the resource.

Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace, http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace, http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
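The two triples of graphic 2 can be written down directly as data. The Python sketch below stores them as (subject, predicate, object) tuples and looks up the objects of a given subject and predicate. The ‘#’ separator between namespace and class name is an assumption, since the extracted URIs no longer show their punctuation; the helper names are hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: a subject linked to an object by a predicate."""
    subject: str
    predicate: str
    obj: str


# Namespace URIs; the trailing '#' is an assumed separator.
MI = "http://www.m-i.org/schemas/images#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

GRAPH = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Return every object reachable from subject via the given predicate."""
    return {t.obj for t in graph
            if t.subject == subject and t.predicate == predicate}


if __name__ == "__main__":
    print(objects(GRAPH, MI + "ColumnMiniature", RDFS + "subClassOf"))
```

Note that the same URI (here the one for Miniature) appears once as an object and once as a subject, which is how individual triples join up into a graph.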

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule
<image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
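How such a template might be evaluated can be sketched in a few lines of Python. This is not the Meedio tool, whose internals are not described here; it is a minimal illustration that uses the limited XPath support of the standard library. The sample document and its element values are assumptions, as graphic 1 is not reproduced in this section.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the XML document image5kb78d38i.xml.
DOCUMENT = """<image image_id="image5kb78d38i">
  <type>Image Miniature</type>
  <creator>Alexander Master</creator>
</image>"""

# A mapping rule: XPath for the subject, a fixed property, XPath for the object.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(xml_text, rule):
    """Instantiate the rule's XPath expressions with matching element values.

    Returns a list with one triple, or an empty list if the rule does not match.
    """
    root = ET.fromstring(xml_text)
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []
    return [(subject.strip(), predicate, obj.strip())]


if __name__ == "__main__":
    print(apply_rule(DOCUMENT, RULE))
```

A rule that addresses an element missing from the document simply yields no triple, which corresponds to the case where the rule does not match.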

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In the section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
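The effect of transitivity can be worked out mechanically. The following Python sketch, with hypothetical function and variable names, walks the stated rdfs:subClassOf links upwards and collects every class reached, including those that are only implied.

```python
# Stated subclass links: class -> set of direct superclasses.
SUB_CLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, sub_class_of):
    """Return all direct and implied superclasses of a class."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    print(sorted(superclasses("mi:ColumnMiniature", SUB_CLASS_OF)))
```

Only two subclass statements were written down, yet the result for mi:ColumnMiniature contains both mi:Miniature and mi:Image; the second link is the implicit one described above.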

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
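Read as restrictions, domain and range can be checked statement by statement. The Python sketch below follows that constraint reading, which is the one used in this primer; in the W3C’s formal RDFS semantics, domain and range are used to infer class membership rather than to reject statements. The instance identifiers (ex:...) and the class mi:DecoratedInitial are hypothetical, and subclass reasoning is left out to keep the example short.

```python
# Declared constraints, as in the mi:hasType range statements above.
CONSTRAINTS = {
    "mi:hasType": {"range": "mi:ColumnMiniature"},
}

# Hypothetical instance data: resource -> class it is an instance of.
TYPES = {
    "ex:record1": "mi:Image",
    "ex:miniature1": "mi:ColumnMiniature",
    "ex:initial1": "mi:DecoratedInitial",
}


def violations(statement, constraints, types):
    """Return the names of the constraints ('domain', 'range') a statement breaks."""
    subject, predicate, obj = statement
    rules = constraints.get(predicate, {})
    broken = []
    if "domain" in rules and types.get(subject) != rules["domain"]:
        broken.append("domain")
    if "range" in rules and types.get(obj) != rules["range"]:
        broken.append("range")
    return broken


if __name__ == "__main__":
    print(violations(("ex:record1", "mi:hasType", "ex:miniature1"), CONSTRAINTS, TYPES))
    print(violations(("ex:record1", "mi:hasType", "ex:initial1"), CONSTRAINTS, TYPES))
```

A check of this kind corresponds to the step in which the cataloguer’s statements are validated against the property constraints of the ontology before they are saved.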

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’ [13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
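The harvesting and filtering steps described in this section can be imitated on a very small scale. The Python sketch below is not the Ontogator software; it merely forms the union of a few RDF cards with the shared ontology and then selects the instances of a chosen class, taking subclasses into account. All identifiers (museumA:item1 and so on) are hypothetical, and the image classes of the earlier examples stand in for the textiles ontology.

```python
# Shared ontology and two hypothetical RDF cards from different museums.
ONTOLOGY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}
CARD_A = {("museumA:item1", "rdf:type", "mi:ColumnMiniature")}
CARD_B = {("museumB:item7", "rdf:type", "mi:Image")}


def merge_cards(cards):
    """Combine harvested sets of triples into one repository (a set union)."""
    repository = set()
    for card in cards:
        repository |= set(card)
    return repository


def instances_of(repository, selected_class):
    """View-based filtering: instances of the selected class or its subclasses."""
    classes = {selected_class}
    changed = True
    while changed:  # repeat until no further subclass is discovered
        changed = False
        for subject, predicate, obj in repository:
            if (predicate == "rdfs:subClassOf" and obj in classes
                    and subject not in classes):
                classes.add(subject)
                changed = True
    return {subject for subject, predicate, obj in repository
            if predicate == "rdf:type" and obj in classes}


if __name__ == "__main__":
    repo = merge_cards([ONTOLOGY, CARD_A, CARD_B])
    print(sorted(instances_of(repo, "mi:Image")))
    print(sorted(instances_of(repo, "mi:Miniature")))
```

Selecting the broad class mi:Image returns the records of both museums, while constraining the view to mi:Miniature narrows the result to the single column miniature, which mirrors how constraining views further leads the user to the instance data searched for.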

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems. [15]

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.

[14] T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

[15] M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources

In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer

K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3(1) (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

The Finnish Museums on the Semantic Web: Overview of the System’s Set-up
| User client: WWW browser (topic-based navigation, view-based filtering)
| Server: Ontogator software
| RDF database: semantic graph, the knowledge space of shared ontology (RDFS) and metadata
| Web crawler: harvests the RDF instances of each museum
| RDF Schema (semantic interoperability): RDF instances of each museum
| XML Schema (syntactic interoperability): Metadata Editor and XML repository of each museum
| Relational schemas (DBMS): collection databases 1, 2, ... n

38 DigiCULT

Wernher Behrendt Salzburg Research Austria Wernher Behrendt is a Senior Researcher at theSun Technology and Research Excellence Center atSalzburg Research (Austria) working on multimediamiddleware and interoperation issues He holds anMSc in Cognitive Science from Manchester Uni-versity and has more than 10 yearsrsquo experience innear-to-market IT research From 1989 to 1995 hewas a Senior Research Associate in the InformaticsDepartment at Rutherford Appleton Laboratory(UK) working on embedded knowledge basedsystems in distributed multimedia presentationsystems From 1995 to 1998 he was a SeniorResearch Associate at Cardiff University (UK)working on interoperation between heterogeneousinformation systems Mr Behrendt has held coursesin Computer Science and has worked on projectsranging from software engineering methods andquality assurance to legacy system reengineeringusing migration methods and distributed systemsmiddlewareE-mail wernherbehrendtsalzburgresearchat

Paolo BuonoraArchivio di Stato di Roma ItalyPaolo Buonora holds a degree in Philosophy fromthe University of Rome lsquoLa Sapienzarsquo (1976) Heworked in the Italian State Archive Administrationfrom 1978 where he was first involved in editing theGuida Generale degli Archivi di Stato italiani From1986 he worked in the Soprintendenza archivisticaper il Lazio surveying audiovisual archives muni-cipal archives from 1989 to 1991 at the PerugiaUniversity engaged in a doctoral research in lsquoUrbanand rural historyrsquo and from 1991 to 1994 again inthe Soprintendenza archivistica per il LazioAfter1994 he worked in the Archivio di Stato di Romawhere he was responsible for the photograph ser-vice and several working groups on informaticsapplication in archival documentation From 1997until the present time he has planned and directedthe Imago II project in the Archivio di Stato diRomaSee httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara University of Nancy 2and LORIA FranceSamuel Cruz-Lara obtained a Masterrsquos degree inComputer Science in 1984 (University HenriPoincareacute Nancy 1) and a PhD degree in ComputerScience in 1988 (National Polytechnic Institute ofLorraine)The central topic of his PhD thesis was thegeneration of integrated development environmentsby using attribute grammarsHe is currently Associate Professor at the Universityof Nancy 2 (Institute of Technology ComputerScience Department) and permanent Researcher atLORIA (Lorraine Laboratory for Research in Com-puter Science and its Applications ndash UMR 7503 ndashCNRS ndash INRIA ndash Universities of Nancy) He is amember of the lsquoLanguage and Dialoguersquo team and hasconducted several research activities on distributedsoftware architectures and textual linguistic resourcesmanagement He is currently working in the contextof distributed architectures and multimedia resourcesmanagement Dr Cruz-Lara has participated in severalprojects in particular CNRSSILFIDE and MLIS-ELAN and he is at present co-leader of the lsquoDigitalMuseumrsquo projectThis is a joint project betweenLORIA and the National Chi-Nan UniversityThelsquoDigital Museumrsquo project is sponsored by theNational Science Council of the Republic of China(Taiwan) and supported by INRIA (France)

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

THE DARMSTADT FORUM PARTICIPANTS

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl.
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it


DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DIGICULT PROJECT INFORMATION

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


INTRODUCTION AND OVERVIEW
By Guntram Geser

FUNCTION AND FOCUS

DigiCULT, as a support measure within the Information Society Technologies Programme (IST), will for a period of 30 months (beginning March 2002) provide a technology watch mechanism for the cultural and scientific heritage sector. Backed by a network of peer experts, the project monitors, discusses and analyses existing and emerging technologies likely to bring benefits to the sector.

To promote the results and encourage early take-up of relevant technologies, DigiCULT has put in place a rigorous publication agenda of seven Thematic Issues and three in-depth Technology Watch Reports, as well as the DigiCULT.Info e-journal, pushed to a growing database of interested persons and organisations on a regular basis. All DigiCULT products can be downloaded from the project Web site http://www.digicult.info as they become available. The opportunity to subscribe to DigiCULT.Info is also found here.

March 2003 saw the release of the first DigiCULT Technology Watch Report. This report covers the topics Customer Relationship Management, Digital Asset Management Systems, Smart Labels and Smart Tags, Virtual Reality and Display Technologies, Human Interfaces, and Games Technologies. Addressing primarily technological issues, it serves as a guide to what a heritage institution needs to consider when buying into one of these technologies.

In comparison with the Technology Watch Reports, the Thematic Issues focus more on the organisational, policy and economic aspects of the technologies under consideration. They are based on the expert round tables organised by the DigiCULT Forum secretariat. In addition to the Forum discussion, they provide opinions of other experts in the form of articles and interviews, case studies, and short descriptions of related projects, together with a selection of relevant literature.

TOPIC AND CHALLENGE

This third Thematic Issue addresses the questions: What is the Semantic Web? What will it do for heritage institutions? And what is the role of certain languages, in particular XML and RDF?

In short, the Semantic Web vision proclaims a Web of machine-readable data which allows software agents to automatically carry out rather complex tasks for humans. Key to realising this vision is semantic interoperability of Web resources. Yet such interoperability is not the primary goal of heritage institutions (and intelligent software agents are not readily at hand).

What the institutions are looking for are new ways of providing scholarly and non-expert users (e.g. school classes, lifelong learners) with access to their collections and related knowledge. This goal can be accomplished, for example, through online collections and exhibitions that not only display objects and simple descriptions (drawn from metadata), but also allow for understanding relationships between objects (created by semantically interrelated metadata). The Semantic Web community promises to assist in achieving this goal, but the challenge for the heritage institutions would be to first implement the necessary data infrastructure.

Image caption: Philosophy in Discussion With a Philosopher

The challenge for the Semantic Web expert round table was, or at least the DigiCULT Secretariat thought it was, not to run into a debate between ‘theory’ and ‘practice’. In other words, between what academic Semantic Web scholars and what practitioners from heritage institutions think needs to be accomplished, what is feasible and affordable, and where to concentrate efforts. For the discussion, XML seemed to provide a good starting point: XML, on the one hand, is increasingly considered by heritage institutions as a key standard for publishing metadata on the Web; on the other hand, it is a major building block for the Semantic Web. It proved different, in a positive sense. In the discussion, wide use of XML was taken for granted, while the key area of interest that surfaced, and was seen to be most fruitful to explore, was ontologies.

OVERVIEW

Setting the context for this Issue, the position paper looks into the requirements for achieving the goals of the Semantic Web and assesses whether the available technologies will be able to deliver on what the advocates of the Semantic Web envisage, as well as whether the cultural heritage sector is in a position to take substantial steps towards semantic interoperability. It concludes with the argument that the sector is more likely to be left behind, due in particular to the fact that, for the institutions, the rewards for the necessary investments are still too nebulous.

Janneke van Kersen from the Dutch Digital Heritage Association, in her interview with the DigiCULT Journalist, suggests that, despite the cloudy Semantic Web horizon, there are medium-term benefits to be gained for heritage institutions in taking steps towards the vision. And she states that it is up to associations like hers, together with larger institutions, to take the lead in this, prove that proposed solutions work, and support smaller institutions in taking advantage of them. On the other hand, Nicola Guarino in his interview believes that reaching the ‘real’ Semantic Web lies in taking ‘the fundamental route’ of implementing generic ontologies based on linguistics and logics within the Semantic Web fabric. He also claims that even incremental progress along this path can have remarkable pay-offs.

Michael Steemson’s summary of the Darmstadt Forum illustrates that the Semantic Web topic resembles a labyrinth, with currently no definite map or Ariadne’s Thread at hand. Building on the many technologies the Forum participants mentioned as some of the labyrinth’s angles, we have added to the summary a list of resources related to these technologies.

In an effort to raise the veil of mystery surrounding the Semantic Web, this issue includes an example from the sector on the implementation of semantic interoperability of metadata, combined with a primer that explains core building blocks such as XML, RDF and ontologies. While a detailed primer of, for example, RDF would alone exhaust the limits of this issue,¹ the goal here is to deliver an ‘all-inclusive’ primer within the space permitted, with all the inevitable limitations this entails. The primer attempts to provide a general understanding of the Semantic Web architecture without obliging the reader to wander through the long and perplexing corridors of language specifications.

Finally, we want to thank the Koninklijke Bibliotheek, National Library of the Netherlands, for their kind permission to use selected images from their collection of illuminated medieval manuscripts.² We hope you will appreciate the little narratives they represent within the overall fabric of this DigiCULT Thematic Issue.

¹ Cf. F. Manola, E. Miller: RDF Primer (W3C Working Draft, 23 January 2003), http://www.w3.org/TR/rdf-primer
² See their online collection of such images at http://www.kb.nl/kb/manuscripts, which offers advanced search and presentation features.

POSITION PAPER
By Seamus Ross

Tim Berners-Lee and his colleagues at W3C have recognised that the real benefits of the web-based information revolution will come from enabling the interoperability of content. The current generation of web delivery is, they have argued, designed for human users, who struggle to make effective use of the billions of pages of information currently accessible. When we search for something at the moment, we sometimes discover suitable candidate information, but more often than not this is far from being the case. More than this, the entire process of searching, discovery and use is designed to be driven by humans. When we discover one piece of the puzzle, we need manually to position that information so that it can help us to search out the next piece of the puzzle. We find that Darmstadt is near Frankfurt. Then we find that there are flights from Glasgow to Frankfurt and there is a bus from Frankfurt Airport to Darmstadt. Then I search for timetables, make manual comparisons and decide which times best meet my requirements. In the Shangri-La that is the Semantic Web, my ‘agent’ would recognise from its regular review of my diary that I needed to be at a meeting in Darmstadt on the 21st of January 2003, and it would search out the options, analyse the timetables, identify the optimum travel arrangements, book my non-smoking hotel accommodation and order the taxi to take me to the airport. (It might even check the weather forecasts and warn me to bring particular types of clothing.) Certainly, to make this happen there has to be a fundamental shift in the way data, information and knowledge are represented on the web.

Image caption: Genesis – The Creation: Division of Light and Darkness

The proliferation of web-based resources makes finding what you are looking for increasingly difficult. According to Internet user studies, in 1996 50% of Internet users reported spending time looking for information without finding it, but by 2002 only about 40% of users ended their ‘searching sessions’ unsuccessfully. At first glance we might conclude that web discovery tools have improved and/or the information searching skills of users have improved. Over the past seven years the quantity of content has mushroomed, the search tools have become more efficient, developers approach the use of meta-tags more effectively, and anecdotal evidence suggests that the searching techniques of users have become more sophisticated. We should continue to be surprised by the high failure rate and wonder why it remains proportionally so high as the numbers of users have grown to nearly 600 million. In reality, there is just too much content available. It is poorly described. It is not interconnected. Search engines themselves are blunt instruments. Most users of the web do not have very mature searching strategies and rarely use even the blunt instruments as effectively as they might. A solution is to make more of the information capable of discovery, interpretation and reuse by automated information processing tools themselves. However, the current ways in which content is represented on the web make it nearly impossible for machines to search the web meaningfully and effectively – even with the limitations of their skills and tools, humans are better at searching the web than the most powerful of the current generation of agents. The emergence of the Semantic Web would solve this problem.

The web has made us realise the tremendous potential of digital resources and made them widely available. Content as presented on the web currently is mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented, so that it is visible and discoverable within the context of the Semantic Web.

Image caption: Genesis – The Creation: Division of the Waters Above and Below the Firmament

The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.

TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES

Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about it in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL,¹ do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S), because it provides a flexible and extensible mechanism to represent metadata. A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML-based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response, other languages such as DAML+OIL have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web. If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).
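To make the idea of RDF as a mechanism for expressing metadata a little more concrete, the following is a minimal sketch in plain Python, not an RDF toolkit. It models metadata as subject-predicate-object statements, which is the basic shape of RDF. The `example.org` namespace, the object and person identifiers and the helper function names are assumptions invented for illustration; only the Dublin Core element namespace is a real one.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One metadata statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


EX = "http://example.org/collection/"      # hypothetical collection namespace
DC = "http://purl.org/dc/elements/1.1/"    # Dublin Core element set namespace

# Invented sample statements about one museum object and its maker.
triples = [
    Triple(EX + "object/42", DC + "title", "Vase with dragon motif"),
    Triple(EX + "object/42", DC + "type", "vase"),
    Triple(EX + "object/42", DC + "creator", EX + "person/7"),
    Triple(EX + "person/7", DC + "title", "Unknown workshop"),
]


def describe(subject, graph):
    """Return all (predicate, object) pairs stated about one resource."""
    return [(t.predicate, t.obj) for t in graph if t.subject == subject]


def linked_resources(subject, graph):
    """Return objects of statements that are themselves described resources."""
    described = {t.subject for t in graph}
    return [t.obj for t in graph if t.subject == subject and t.obj in described]
```

In this sketch, `linked_resources(EX + "object/42", triples)` follows the creator statement to a second described resource, which is the kind of interrelation between records that flat descriptive metadata does not offer on its own.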

The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains, it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?
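The point that element names do not carry meaning can be illustrated with a small sketch using Python's standard `xml.etree.ElementTree` module. The two records, their element names and the `SHARED_MEANING` mapping are all hypothetical; the mapping stands in for the agreement on meaning that XML, DTDs and XML Schemas do not supply by themselves.

```python
import xml.etree.ElementTree as ET

# Two invented records describing the same facts with different element names.
record_a = "<object><creator>Alexander Master</creator><date>c. 1430</date></object>"
record_b = "<item><illuminator>Alexander Master</illuminator><made>c. 1430</made></item>"

# Hypothetical mapping from local element names to shared concepts.
# Nothing in the XML itself states that 'creator' and 'illuminator' coincide.
SHARED_MEANING = {
    "creator": "maker",
    "illuminator": "maker",
    "date": "creation_date",
    "made": "creation_date",
}


def element_names(xml_text):
    """Collect the names of the child elements of the record's root."""
    return {child.tag for child in ET.fromstring(xml_text)}


def interpret(xml_text, mapping):
    """Re-key a record by shared concept, skipping unmapped elements."""
    return {
        mapping[child.tag]: child.text
        for child in ET.fromstring(xml_text)
        if child.tag in mapping
    }
```

Both records are well-formed XML, yet a program comparing element names sees no overlap between them. Only after the externally supplied mapping is applied do the two records turn out to say the same thing.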

¹ See the ‘Semantic Web Terms and Reading List’ in this Issue.


ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB

For the Semantic Web to succeed, it will require not only modelling languages such as XML, RDF and OWL, but also methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.

The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections, and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but that does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:

| Can we cost the creation of appropriate ontologies for the heritage sector?

| How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop and which ones will we be able to borrow from other sectors)?

| What heritage-based organisations should focus on ontology creation?

| Ontologies often fail to be interoperable. What solutions are there to this problem and how can they be made to work effectively?

| Does OWL (W3C’s Web Ontology Semantic Markup Language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
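The multilingual gap described above (an ontology that knows ‘watercolour is a painting’ but not that ‘aquarelle’ names the same concept) can be illustrated with a minimal Python sketch. It is not OWL and not any project’s actual model: the `Concept` class, the `is_a` and `find_by_label` helpers and the label data are all hypothetical, chosen only to show class-taxonomy links kept separate from per-language labels.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """A node in a toy class taxonomy (hypothetical, for illustration only)."""
    concept_id: str
    parent: Optional["Concept"] = None
    # Language code -> preferred term; the concept itself is language-neutral.
    labels: Dict[str, str] = field(default_factory=dict)


def is_a(concept: Optional[Concept], ancestor: Concept) -> bool:
    """Follow 'is a type of' links upward until the ancestor is found or the root is passed."""
    node = concept
    while node is not None:
        if node is ancestor:
            return True
        node = node.parent
    return False


def find_by_label(concepts: List[Concept], term: str) -> Optional[Concept]:
    """Return the first concept labelled with this term in any language, or None."""
    wanted = term.lower()
    for concept in concepts:
        if wanted in (label.lower() for label in concept.labels.values()):
            return concept
    return None


painting = Concept("painting", labels={"en": "painting", "fr": "peinture"})
watercolour = Concept("watercolour", parent=painting,
                      labels={"en": "watercolour", "fr": "aquarelle"})
jewellery = Concept("jewellery", labels={"en": "jewellery"})
necklace = Concept("necklace", parent=jewellery, labels={"en": "necklace"})
concepts = [painting, watercolour, jewellery, necklace]

# Because labels hang off the concept, the French terms resolve to the same nodes.
aquarelle = find_by_label(concepts, "aquarelle")
peinture = find_by_label(concepts, "peinture")
print(is_a(aquarelle, peinture))
```

The design point is only that the ‘type of’ relation is stated once, between concepts, while each language contributes labels. A term with no label entry (for example a French word for necklace here) is simply not found, which is the gap the text warns about.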

Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.

Proof and trust are emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity, and a lack of methods for testing its presence.

LEGITIMISING THE SEMANTIC WEB INVESTMENT

Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of web-page design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analyses that are necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen we need a stronger argument for the benefits that such investment will bring to the heritage sector.

Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.

At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do and you could incrementally add more later. Slowly heritage institutions found ways to take advantage of the opportunities offered by the web; there are still many small and medium-sized heritage institutions that have not.

[Image: Genesis – The Creation: Division of Sea and Earth; Creation of Trees and Plants]

Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that the access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to heritage sector institutions through better knowledge about, care of, and access to their collections. In the ALM sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited, except of course at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have too little visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.

The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.


CONCLUSION

Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.

Ultimately the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.

Bibliography

Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 117-131.

Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.

Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3(1), 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf

Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.

Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.

Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 448-453.

Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.

Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.

Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 12010, 676-680.

Klein, M. 2001. XML, RDF, and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.

McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.

Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf

Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.

Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA), ACM Publication.

Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.


DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL

AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS

By Joost van Kasteren

‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’

Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.

The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).

The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.

Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
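As a rough illustration of what mapping Dublin Core to the 5 Ws could look like, the Python sketch below groups the fields of a record under the five questions. The element names are those of the unqualified Dublin Core element set, but the assignment of elements to questions in `FIVE_WS` is an assumption made for this example; the interview does not give the mapping the DEN actually uses, and the sample record is invented.

```python
# Assumed assignment of Dublin Core elements to the five questions.
# This is an illustrative guess, not the mapping used by the Cultuurwijzer.
FIVE_WS = {
    "who": ("creator", "contributor", "publisher"),
    "what": ("title", "subject", "type"),
    "where": ("coverage",),
    "when": ("date",),
    "why": ("description",),
}


def to_five_ws(record):
    """Group a flat Dublin Core record (element -> value) under the five questions."""
    view = {}
    for question, elements in FIVE_WS.items():
        values = [record[element] for element in elements if element in record]
        if values:
            view[question] = values
    return view


# Invented sample record; 'format' has no slot in the assumed mapping and is dropped.
record = {
    "title": "Salome",
    "creator": "Unknown artist",
    "date": "1500",
    "format": "oil on panel",
}
print(to_five_ws(record))
```

Elements without a slot fall out of the cross-database view, which is one concrete form of the ‘fuzziness and lack of precision’ that has to be accepted at this level.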

According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’

Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.

There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification and the Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’

Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.

The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)

Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’

Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl


DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.

‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction, it is like North. You go North but you never arrive and say “here it is”. This is the Semantic Web,’ said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.

[Image: Genesis – The Creation: Birds and Fishes]

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article,¹ Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.

The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web’, and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’

1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

2 Ontology, n. 1. Philosophy: the branch of metaphysics that deals with the nature of being. 2. Logic: the set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow, 1991.

Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day, and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’

3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it

4 Telos Corporation, http://www.telos.com, Ashburn, VA, US.

5 CIDOC, International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), http://www.willpowerinfo.myby.co.uk/cidoc/CIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information, and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001. http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe. An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
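To make the OAI Protocol for Metadata Harvesting a little more concrete, the Python sketch below builds the two request types named in the description, ListRecords and GetRecord. The verbs, the `metadataPrefix` and `identifier` arguments and the `oai_dc` prefix for unqualified Dublin Core come from OAI-PMH 2.0. The repository address, the record identifier and the `oai_request` helper are hypothetical, and the sketch only constructs the URLs; it does not contact any server or parse the XML responses.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def oai_request(base_url, verb, **arguments):
    """Build an OAI-PMH request URL: the verb first, then its arguments, URL-encoded."""
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


# Harvest every record as unqualified Dublin Core.
list_url = oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Fetch a single record by its (invented) identifier.
get_url = oai_request(
    BASE_URL,
    "GetRecord",
    identifier="oai:example.org:item-1",
    metadataPrefix="oai_dc",
)

print(list_url)
print(get_url)
```

A service provider such as a search engine would issue requests of this shape over HTTP and validate the XML that comes back against the protocol’s schema.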

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something something dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission's On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.

Dr Ross described the DARPA Agent Markup Language (DAML) programme of the US Defense Department's Defense Advanced Research Projects Agency (DARPA) and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration) and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers' audio-visual search standard MPEG-7 and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

'Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.'

He explained further: 'There are general terms that have a universal meaning. Take the term "part", for instance, or "set". Or take temporal relations: "before". Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes "before" the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the "before" relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms (topological relations, mereological⁹ relations, dependence relations, these kinds of things), then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called "Foundational Ontologies", and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.'
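The stipulation described here can be illustrated with a small Python sketch. It is not taken from any foundational ontology; the class `Period`, the two functions and the year values are hypothetical, chosen only to show how two different axioms give two different answers for the same pair of overlapping periods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period
    end: int    # last year of the period

    def __post_init__(self):
        # Axiom shared by both readings: a period cannot end before it starts.
        if self.end < self.start:
            raise ValueError(f"{self.name}: end precedes start")


def before_strict(a, b):
    """Reading 1: a is before b only if a has ended when b starts."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """Reading 2: a is before b if a starts first, even if the two overlap."""
    return a.start < b.start


if __name__ == "__main__":
    # Made-up year values, used only to produce an overlap.
    earlier = Period("Period A", 1300, 1500)
    later = Period("Period B", 1450, 1600)
    print(before_strict(earlier, later))            # False: the periods overlap
    print(before_allowing_overlap(earlier, later))  # True: A starts first
```

Two systems that both record 'A before B' will only exchange data reliably if they have agreed on which of the two definitions they mean.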

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

On-To-Knowledge

On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project's many results see their tools repository, project deliverables and publications at httpwwwontoknowledgeorgpubshtml

See also the On-To-Knowledge book: 'Towards the Semantic Web: Ontology-driven Knowledge Management', J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: httpwwwontoknowledgeorgtoolsfactsheetSesamehtml
9 Mereology, n.: The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

Italian online publisher Marco Meli, a manager of the EU's MESMUSES (Metaphors for Science Museums) project, insisted: 'You need a clear definition in this particular domain. What are the key words, the terms?'

Mr Guarino answered: 'The key concepts and the key relations.'

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.

Someone talked about 'meta-ontology' and another growled: 'The term "meta" is abused.'
'Outdated technical optimism' was mentioned.
'What is your alternative?' someone else wanted to know.
'I don't have one.'
'Is this the way forward?'
'What part of the problem would that solve?'
'We will know when we have tried that out.'
'Here we are approaching a scary field.'

MESMUSES - Metaphors for Science Museums

MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.

The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l'Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).

Project Web site: http://cweb.inria.fr/Projects/Mesmuses

See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: 'There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage (architecture, film, whatever), all working with very different substances.'

Seamus Ross added: 'So we need one fundamental ontology on which we can build all the others.'

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert Dr Dallas described work among his company clients on developing 'an upper ontology'. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop 'something very much like a thesaurus' with term expansion that created sub-categories of relationships. He called it 'generic layering', a process that could identify the 'generic grammar' of relationships within a specific domain - art history, for example.

'This is useful,' he said. 'This way we can create Web systems that present an association of content for users that is meaningful to them. Let's say a "cultural" meaning.'

Nicola Guarino went further. He believed that the International Council of Museums' Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the 'best starting point' for the heritage community. The CIDOC CRM is the result of 10 years' work by a standards working group. The model is under review for adoption as an International Standards Organisation (ISO) publication.

Mr Guarino was enthusiastic: 'I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.'

He affirmed Dr Ross's delighted question: 'So this is an ontology we can borrow?'

And the Italian expert had more to add: 'The question before was "how can we be sure that the principled things can really solve our problems?" I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.'

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: 'Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.'

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: 'Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.'

The Darmstadt 13 were pleased. They had a model. They had 'Web Services' stepping stones to work across. They weren't going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry but saw benefit in what University of Florence Associate Professor Franco Niccolucci called 'cultural entertainment'.

So where would the experts put their money, asked Mr Behrendt: 'On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?'

Amsterdamer Frank Nack was in no doubt: 'It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.' His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: 'We have put our money there. All our applications work with XML ...' And Dr van Kersen agreed: 'I would put my money on the Semantic Web.'

That seemed to make it game, set and match.

CIDOC CRM: 'The Semantic Glue'

The 'CIDOC object-oriented Conceptual Reference Model' (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

'The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.'

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

SEMANTIC WEB TERMS AND READING LIST A-X

Compiled by Guntram Geser and Michael Steemson

Genesis - The Creation Stars and Fishes

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal: http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

Annotation & Authoring

'How to annotate and have fun', http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a 'Live Early Adoption and Demonstration (LEAD)' project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM

The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council for Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr/

A report on CIDOC's work on the CRM is provided in 'The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents' by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios. See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language

The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

'Why Use DAML?', Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL

DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


OIL - Ontology Inference Layer

OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: 'An informal description of Standard OIL and Instance OIL' by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies

Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

'Ontology Infrastructure for the Semantic Web', includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language

The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group: http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language see 'Web Ontology Language (OWL) Use Cases and Requirements' (W3C Working Draft, 31 March 2003): http://www.w3.org/TR/webont-req

RDF - Resource Description Framework

See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web

'The Semantic Web', Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

'Enhanced Science and the Semantic Web', J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521

'Peer-to-Peer: The Infrastructure for the Semantic Web', Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium's plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group) as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman's warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE - Simple HTML Ontology Extensions

SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web

The Synchronized Multimedia Integration Language (SMIL, pronounced 'smile') enables authoring of interactive audiovisual presentations. SMIL is typically used for 'rich media'/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo

SPECTRUM-XML DTD

SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

'SPECTRUM: The UK Museum Documentation Standard' is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services

Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
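As a rough illustration of what 'based on XML' means for SOAP, the following Python sketch wraps a request in a SOAP 1.1 envelope. The service namespace `http://museum.example.org/search` and the `SearchCollection` operation are hypothetical; only the envelope namespace comes from the SOAP 1.1 specification.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://museum.example.org/search"             # hypothetical service namespace


def build_search_request(keyword):
    """Wrap a hypothetical SearchCollection call in a SOAP 1.1 envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}SearchCollection")
    ET.SubElement(call, f"{{{APP_NS}}}keyword").text = keyword
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(build_search_request("textiles"))
```

In practice the operation name and its parameters would be read from the service's WSDL description rather than written by hand.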

For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity

World Wide Web

Looking back as well as into the future: 'Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor', Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML - eXtensible Markup Language

See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

'The bane of my existence is doing things that I know the computer could do for me.' - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES

AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

'The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.'

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: 'In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.'

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term 'net', for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of what a generic ontology has to define is the term 'part', which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term 'part'.

Another example cited by Guarino is the term 'in'. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: 'These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.'

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling: 'Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.'

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: 'I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.'

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER

By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of 'a distributed machine which should function so as to perform socially useful tasks'¹. This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on 'languages for expressing information in a machine processable form'². Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the "Finnish Museums on the Semantic Web" (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The "Finnish Museums on the Semantic Web" (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group's two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme: http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project's vision is called "Finnish Museums on the Semantic Web" (FMS) and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.

In order to reach the FMS system's goal of making the museums' metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: 'I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it's gotten most of its traction in the mundane world of metadata.'³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF's intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as 'an ideal representation for Dublin Core information' and describes Dublin Core as one of their 'RDF in the field' examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), 'Expressing Simple Dublin Core in RDF/XML' was announced as a DCMI Recommendation in October 2002, 'the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies', i.e. XML/RDF/XHTML. 'Expressing Qualified Dublin Core in RDF/XML' is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents

In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation the datafrom a museumrsquos collection database are retrievedand transformed to an XML format conforming tothe XML Schema of the FMS initiative

The data to be published are read from the data-base through a lsquoviewrsquo which helps create the XMLformatThe view is a queryable interface a virtualtable that results from an SQL query which may joinmultiple tables of the databaseThrough the view thedata are queried so that the rows of the tables aregrouped by collection items For each item the setof rows is combined into a single XML document
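The view-to-XML step can be sketched in a few lines of Python. This is a minimal illustration, not the FMS implementation: the two tables, the view definition and the function names are assumptions made up for the example, and only a subset of the item columns is used. The namespace is the one from the primer's own XML example.

```python
import sqlite3
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # namespace used in the primer's example
ET.register_namespace("mi", MI_NS)


def build_demo_db():
    """Create an in-memory database with two hypothetical tables and a view."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, creator TEXT);
        CREATE TABLE subject (item_id TEXT, subject TEXT, iconclass TEXT);
        INSERT INTO item VALUES
            ('image5kb78d38i', 'Column Miniature', 'Alexander Master');
        INSERT INTO subject VALUES
            ('image5kb78d38i', 'Eve emerges from Adam''s body', '71A3421');
        -- The view joins the tables and orders the rows by collection item.
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, s.subject, s.iconclass, i.creator
            FROM item i JOIN subject s ON s.item_id = i.item_id
            ORDER BY i.item_id;
    """)
    return con


def rows_to_xml(con):
    """Return {item_id: root element}, one XML document per collection item."""
    docs = {}
    cur = con.execute("SELECT * FROM item_view")
    columns = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(columns, row))
        item_id = record.pop("item_id")
        root = docs.get(item_id)
        if root is None:
            # First row for this item: start a new document.
            root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
            docs[item_id] = root
        # Further rows for the same item would append further elements.
        for name, value in record.items():
            ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
    return docs


docs = rows_to_xml(build_demo_db())
xml_text = ET.tostring(docs["image5kb78d38i"], encoding="unicode")
print(xml_text)
```

Reading through a view keeps the museum's internal table layout out of the export code: only the view definition has to know which tables are joined.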

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents: A short presentation is provided in Eero Hyvönen et al., Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen, Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al., Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services: Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Database rows (item data from database rows, grouped by item):

image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

XML document (image5kb78d38i.xml), the result of the rows-to-XML process:

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
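How a namespace keeps equally named elements apart can be seen by letting a parser expand the prefixes. The sketch below uses Python's standard `xml.etree.ElementTree`, which reports each tag as `{namespace}localname`; the second vocabulary (`http://example.org/other`) and the prefix `img` are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Element from the primer's example vocabulary.
doc_a = '<mi:type xmlns:mi="http://www.m-i.org/images">Column Miniature</mi:type>'
# Hypothetical second vocabulary that also uses the local name "type".
doc_b = '<x:type xmlns:x="http://example.org/other">text/html</x:type>'
# Same namespace as doc_a, but bound to a different prefix.
doc_c = '<img:type xmlns:img="http://www.m-i.org/images">Initial</img:type>'

tag_a = ET.fromstring(doc_a).tag
tag_b = ET.fromstring(doc_b).tag
tag_c = ET.fromstring(doc_c).tag

print(tag_a)           # {http://www.m-i.org/images}type
print(tag_a == tag_b)  # False: same local name, different namespace
print(tag_a == tag_c)  # True: the prefix is only a placeholder
```

The comparison shows that it is the namespace URI, not the prefix, that identifies the vocabulary an element belongs to.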

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all elements must have a closing tag; all start tags must match their end tags, and because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>) they must also be written with the same case; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
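These rules are exactly what an XML parser enforces before it hands any data to an application. As a small sketch, the following Python snippet feeds a few hand-made fragments (all invented for this illustration) to the standard-library parser and records which ones it accepts.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "ok": '<image image_id="a1"><creator>Alexander Master</creator></image>',
    "case mismatch": "<image><Creator>Alexander Master</creator></image>",
    "missing end tag": "<image><creator>Alexander Master</image>",
    "unquoted attribute": "<image image_id=a1></image>",
    "bad nesting": "<image><place><year>1430</place></year></image>",
    "two root elements": "<image></image><image></image>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

Only the first fragment passes; each of the others violates one of the rules listed above.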

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation m-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "type", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
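What the schema demands of a document can be made concrete with a hand-written check. Note the assumption: this is not XML Schema validation (Python's standard library has no XSD validator, and in practice a schema-aware validator would be used); the sketch merely re-implements, by hand, the three structural constraints the example schema states: a namespace-qualified root element, a required image_id attribute, and the ordered sequence of simple child elements. The function and variable names are invented for the illustration.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Order of the child elements as declared in the schema's xs:sequence.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]
VALUES = ["Column Miniature", "Eve emerges from Adam's body", "71A3421",
          "Alexander Master", "Historic Bible", "Utrecht", "circa 1430"]


def check_image_document(text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not a namespace-qualified 'image'")
    if "image_id" not in root.attrib:
        problems.append("required attribute 'image_id' is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != wanted:
        problems.append("child elements do not match the declared sequence")
    if any(len(child) for child in root):
        problems.append("a simple-typed child element contains sub-elements")
    return problems


body = "".join(f"<mi:{n}>{v}</mi:{n}>" for n, v in zip(EXPECTED_CHILDREN, VALUES))
valid = f'<mi:image xmlns:mi="{MI_NS}" image_id="image5kb78d38i">{body}</mi:image>'
invalid = f'<mi:image xmlns:mi="{MI_NS}"><mi:year>circa 1430</mi:year></mi:image>'

print(check_image_document(valid))
print(check_image_document(invalid))
```

The second document is well-formed XML, yet it fails two of the checks. This is the difference between well-formedness and validity that the schema step is meant to catch.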

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform from Windows and Mac OS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
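The idea of one source feeding several output channels can be sketched as follows. In practice the display rules would live in a style sheet; here, as a simplifying assumption, two small Python functions (names invented for the example) play that role and render the same XML record once as an HTML fragment and once as a plain-text caption.

```python
import xml.etree.ElementTree as ET
from html import escape

MI = "{http://www.m-i.org/images}"

# One source record, structured once in XML.
source = ET.fromstring(
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:year>circa 1430</mi:year>"
    "</mi:image>"
)


def fields(record):
    """Extract the values that both output channels need."""
    return (record.findtext(MI + "subject"),
            record.findtext(MI + "creator"),
            record.findtext(MI + "year"))


def to_html(record):
    """Channel 1: an HTML fragment for a Web page."""
    subject, creator, year = (escape(v, quote=False) for v in fields(record))
    return f"<h1>{subject}</h1><p>{creator}, {year}</p>"


def to_plain_text(record):
    """Channel 2: a one-line caption, e.g. for a printed label."""
    subject, creator, year = fields(record)
    return f"{subject} ({creator}, {year})"


print(to_html(source))
print(to_plain_text(source))
```

The content is entered once; only the presentation functions differ per channel.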

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.⁴

Ontology

In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example, cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.⁵

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,⁶ and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge, there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.⁷

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place⁸ that provides the hierarchical path for this concept in the classification system.
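For a notation as simple as 71A3421, the hierarchical path can be derived from the code itself, because each broader concept is a leading prefix of the narrower one. The sketch below assumes exactly that and nothing more: full Iconclass notations can contain further syntax that this prefix rule does not handle, and the function name is invented. The labels are the ones given for this path in the primer.

```python
# Labels for the path of 71A3421, as listed in the primer.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def notation_path(notation):
    """Return every leading prefix of a simple notation, broadest first."""
    return [notation[:i] for i in range(1, len(notation) + 1)]


path = notation_path("71A3421")
for depth, code in enumerate(path):
    # Indent each level to show the hierarchy.
    print("  " * depth + f"{code} {LABELS[code]}")
```

A search for the broader code 71A34 (creation of Eve) can therefore retrieve this image by a plain prefix match on its classification code.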

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.⁹

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’¹⁰

Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).¹¹ This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.¹²


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body

6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties, which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled, directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.

Graphic 2: RDF Data Model
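The triple model is simple enough to be held in a plain data structure. As a minimal sketch, the two subclass statements can be stored as a set of (subject, predicate, object) tuples and queried like a graph. The `#` separator in the m-i.org URIs and the helper names are assumptions made for this illustration; the rdf-schema namespace URI is the standard one.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # '#' separator is an assumption

# Each statement is one (subject, predicate, object) triple.
triples = {
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow all arcs labelled `predicate` that leave the node `subject`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def nodes(graph):
    """All resources that appear as subject or object of some triple."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}


print(objects(triples, MI + "Miniature", RDFS + "subClassOf"))
print(len(nodes(triples)))  # 3 nodes connected by 2 arcs
```

Because a graph is just a set of triples, merging two graphs is a set union. This property is what later allows descriptions from different museums to be combined.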

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case, a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples, where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML processing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples, where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule <image5kb78d38i/type mi:hasCreator image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Image Miniature mi:hasCreator ‘Alexander Master’>
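The mechanics of such a rule can be sketched as follows. This is an illustration under stated assumptions, not the Meedio tool: the rule format (a tuple whose first and last entries are path expressions) and the function name are invented, and Python's `xml.etree.ElementTree` is used, which supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

document = ET.fromstring(
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:type>Column Miniature</mi:type>"
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)

# Hypothetical rule format: (path for subject, property name, path for object).
RULE = ("./mi:type", "mi:hasCreator", "./mi:creator")


def apply_rule(rule, doc):
    """Instantiate the triple template with values found in the document."""
    subject_path, predicate, object_path = rule
    subject = doc.findtext(subject_path, namespaces=NS)
    obj = doc.findtext(object_path, namespaces=NS)
    if subject is None or obj is None:
        return set()  # the rule does not match this document
    return {(subject, predicate, obj)}


print(apply_rule(RULE, document))

# A document without a creator element yields no triple.
no_creator = ET.fromstring(
    '<mi:image xmlns:mi="http://www.m-i.org/images">'
    "<mi:type>Column Miniature</mi:type></mi:image>"
)
print(apply_rule(RULE, no_creator))
```

The rule itself contains no values from the document; it only says where to look, so the same rule can be applied to every XML document a museum delivers.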

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1, we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]

The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
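Transitivity means that an application has to follow the subclass arcs repeatedly to find all implied superclasses. A minimal sketch of that traversal, with the class names written as prefixed strings and the function name invented for the illustration:

```python
# Direct rdfs:subClassOf statements: class -> set of direct superclasses.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, direct=SUBCLASS_OF):
    """Return all direct and implied superclasses of `cls`."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in direct.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(superclasses("mi:ColumnMiniature")))
# ['mi:Image', 'mi:Miniature']
```

The statement that mi:ColumnMiniature is a subclass of mi:Image was never written down; it is derived from the two statements that were.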

Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (e.g. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
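Read as constraints, in the way this primer describes them, domain and range can be checked mechanically against a set of statements. The sketch below does so for mi:hasType; the instance names (`ex:...`) and the function names are invented for the illustration, and it is not the validator of the FMS editor tool.

```python
# Schema statements from the text, written as a small lookup table.
SCHEMA = {
    "mi:hasType": {"domain": "mi:ColumnMiniature", "range": "mi:ColumnMiniature"},
}
SUBCLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}

# Hypothetical instances and the class each one is declared to belong to.
INSTANCE_OF = {
    "ex:miniatureA": "mi:ColumnMiniature",
    "ex:miniatureB": "mi:ColumnMiniature",
    "ex:drawing": "mi:Image",
}


def is_instance(resource, cls):
    """True if the resource's class is `cls` or one of its subclasses."""
    current = INSTANCE_OF.get(resource)
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def violations(triple):
    """Return which constraints ('domain', 'range') the statement breaks."""
    subject, predicate, obj = triple
    rule = SCHEMA.get(predicate)
    if rule is None:
        return ["unknown property"]
    problems = []
    if not is_instance(subject, rule["domain"]):
        problems.append("domain")
    if not is_instance(obj, rule["range"]):
        problems.append("range")
    return problems


print(violations(("ex:miniatureA", "mi:hasType", "ex:miniatureB")))  # []
print(violations(("ex:drawing", "mi:hasType", "ex:miniatureA")))     # ['domain']
print(violations(("ex:miniatureA", "mi:hasType", "ex:drawing")))     # ['range']
```

A statement whose subject or value belongs to the wrong class is reported, which is the kind of violation a semantic metadata validator needs to detect before an RDF card is saved.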

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’¹³

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
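The two steps, combining harvested RDF cards and then filtering by class, can be sketched with plain sets of triples. This is an illustration of the principle only, not Ontogator's algorithm; the cards, the item identifiers and the function names are all invented for the example.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}

# Hypothetical RDF cards harvested from two museums.
card_museum_a = {
    ("a:item1", RDF_TYPE, "mi:ColumnMiniature"),
    ("a:item1", "mi:hasCreator", "Alexander Master"),
}
card_museum_b = {
    ("b:item7", RDF_TYPE, "mi:Image"),
    ("b:item8", RDF_TYPE, "mi:Miniature"),
}

# The repository is the union of all harvested cards.
repository = card_museum_a | card_museum_b


def is_subclass_or_same(cls, ancestor):
    """Walk up the class hierarchy from `cls` looking for `ancestor`."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def instances_of(graph, cls):
    """All items typed with `cls` or with any of its subclasses."""
    return sorted(s for s, p, o in graph
                  if p == RDF_TYPE and is_subclass_or_same(o, cls))


def filter_view(graph, cls, required=()):
    """Narrow a class view further by required (property, value) pairs."""
    return [item for item in instances_of(graph, cls)
            if all((item, p, v) in graph for p, v in required)]


print(instances_of(repository, "mi:Image"))      # all three items
print(instances_of(repository, "mi:Miniature"))  # the two miniatures
print(filter_view(repository, "mi:Miniature",
                  [("mi:hasCreator", "Alexander Master")]))
```

Selecting a narrower class or adding a property restriction shrinks the result set step by step, which is the behaviour that view-based filtering offers the user.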

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.¹⁴

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action, we mean reactive, proactive, social.’

| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).’

| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

| Learning: an agent will improve its performance over time.
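The three characteristics of flexible autonomous action quoted above can be made a little more concrete with a toy perceive-decide-act cycle. This is a deliberately naive sketch, not a design from Wooldridge's book or from any agent framework: the `Agent` class, the percept strings and the message format are all assumptions made for illustration.

```python
class Agent:
    """Toy agent combining reactive, proactive and social behaviour."""

    def __init__(self, name, goal):
        self.name = name
        self.goal = goal      # proactive: the topic the agent wants to locate
        self.inbox = []       # social: messages received from other agents
        self.found = None     # location of the goal resource, once known

    def tell(self, other, message):
        """Social ability: pass a message (a dict) to another agent."""
        other.inbox.append((self.name, message))

    def step(self, percept, peers):
        """Run one perceive-decide-act cycle and return the chosen action."""
        # Reactive: a change in the environment is handled before anything else.
        if percept == "source offline":
            return "wait"
        # Social: take into account what other agents have reported.
        for _sender, message in self.inbox:
            if message.get("topic") == self.goal:
                self.found = message.get("location")
        self.inbox.clear()
        if self.found:
            return f"fetch {self.found}"
        # Proactive: take the initiative and ask peers for help with the goal.
        for peer in peers:
            self.tell(peer, {"request": self.goal})
        return "ask peers"


seeker = Agent("seeker", goal="Ming vase")
helper = Agent("helper", goal="textiles")
print(seeker.step("source offline", [helper]))
print(seeker.step("ok", [helper]))
helper.tell(seeker, {"topic": "Ming vase", "location": "http://museum.example/vase"})
print(seeker.step("ok", [helper]))
```

Mobility, rationality and learning are left out on purpose; even this small loop hints at why the primer calls agents one of the most complex classes of software technology.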


14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


The Finnish Museums on the Semantic Web: Overview of the System’s Set-up

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Components labelled in the graphic: User Client (WWW browser; topic-based navigation, view-based filtering); Server (Ontogator software); RDF database; Semantic graph (knowledge space of shared ontology and metadata); Ontology (RDFS); RDF Schema (semantic interoperability); XML Schema (syntactic interoperability); Relational Schemas (DBMS); Web crawler; Metadata Editor; XML repositories; RDF instances; Collection databases 1, 2, ... n.

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottiangalileoimssfirenzeit

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT, please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, sross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


INTRODUCTION AND OVERVIEW
By Guntram Geser

Philosophy in Discussion With a Philosopher

FUNCTION AND FOCUS

DigiCULT, as a support measure within the Information Society Technologies Programme (IST), will for a period of 30 months (beginning March 2002) provide a technology watch mechanism for the cultural and scientific heritage sector. Backed by a network of peer experts, the project monitors, discusses and analyses existing and emerging technologies likely to bring benefits to the sector.

To promote the results and encourage early take-up of relevant technologies, DigiCULT has put in place a rigorous publication agenda of seven Thematic Issues and three in-depth Technology Watch Reports, as well as the DigiCULT.Info e-journal, pushed to a growing database of interested persons and organisations on a regular basis. All DigiCULT products can be downloaded from the project Website http://www.digicult.info as they become available. The opportunity to subscribe to DigiCULT.Info is also found here.

March 2003 saw the release of the first DigiCULT Technology Watch Report. This report covers the topics Customer Relationship Management, Digital Asset Management Systems, Smart Labels and Smart Tags, Virtual Reality and Display Technologies, Human Interfaces, and Games Technologies. Addressing primarily technological issues, it serves as a guide to what a heritage institution needs to consider when buying into one of these technologies.

In comparison with the Technology Watch Reports, the Thematic Issues focus more on the organisational, policy and economic aspects of the technologies under consideration. They are based on the expert round tables organised by the DigiCULT Forum secretariat. In addition to the Forum discussion, they provide opinions of other experts in the form of articles and interviews, case studies, and short descriptions of related projects, together with a selection of relevant literature.

TOPIC AND CHALLENGE

This third Thematic Issue addresses the questions: What is the Semantic Web? What will it do for heritage institutions? And what is the role of certain languages, in particular XML and RDF?

In short, the Semantic Web vision proclaims a Web of machine-readable data which allows software agents to automatically carry out rather complex tasks for humans. Key to realising this vision is semantic interoperability of Web resources. Yet such interoperability is not the primary goal of heritage institutions (and intelligent software agents are not readily at hand).

What the institutions are looking for are new ways of providing scholarly and non-expert users (e.g. school classes, lifelong learners) with access to their collections and related knowledge. This goal can be accomplished, for example, through online collections and exhibitions that not only display objects and simple descriptions (drawn from metadata), but also allow for understanding relationships between objects (created by semantically interrelated metadata). The Semantic Web community promises to assist in achieving this goal, but the challenge for the heritage institutions would be to first implement the necessary data infrastructure.

The challenge for the Semantic Web expert round table was, or at least the DigiCULT Secretariat thought it was, not to run into a debate between ‘theory’ and ‘practice’; in other words, between what academic Semantic Web scholars and what practitioners from heritage institutions think needs to be accomplished, what is feasible and affordable, and where to concentrate efforts. For the discussion, XML seemed to provide a good starting point: on the one hand, XML is increasingly considered by heritage institutions as a key standard for publishing metadata on the Web; on the other hand, it is a major building block for the Semantic Web. It proved different, in a positive sense. In the discussion, wide use of XML was taken for granted, while the key area of interest that surfaced, and was seen to be most fruitful to explore, was ontologies.

OVERVIEW

Setting the context for this Issue, the position paper looks into the requirements for achieving the goals of the Semantic Web and assesses whether the available technologies will be able to deliver on what the advocates of the Semantic Web envisage, as well as whether the cultural heritage sector is in a position to take substantial steps towards semantic interoperability. It concludes with the argument that the sector is more likely to be left behind, due in particular to the fact that, for the institutions, the rewards for the necessary investments are still too nebulous.

Janneke van Kersen from the Dutch Digital Heritage Association, in her interview with the DigiCULT Journalist, suggests that despite the cloudy Semantic Web horizon there are medium-term benefits to be gained for heritage institutions in taking steps towards the vision. And she states that it is up to associations like hers, together with larger institutions, to take the lead in this, prove that proposed solutions work, and support smaller institutions in taking advantage of them. On the other hand, Nicola Guarino, in his interview, believes that reaching the ‘real’ Semantic Web lies in taking ‘the fundamental route’ of implementing generic ontologies based on linguistics and logics within the Semantic Web fabric. He also claims that even incremental progress along this path can have remarkable pay-offs.

Michael Steemson’s summary of the Darmstadt Forum illustrates that the Semantic Web topic resembles a labyrinth, with currently no definite map or Ariadne’s Thread at hand. Building on the many technologies the Forum participants mentioned as some of the labyrinth’s angles, we have added to the summary a list of resources related to these technologies.

In an effort to raise the veil of mystery surrounding the Semantic Web, this issue includes an example from the sector on the implementation of semantic interoperability of metadata, combined with a primer that explains core building blocks such as XML, RDF and ontologies. While a detailed primer of, for example, RDF would alone exhaust the limits of this issue,1 the goal here is to deliver an ‘all-inclusive’ primer within the space permitted, with all the inevitable limitations this entails. The primer attempts to provide a general understanding of the Semantic Web architecture without obliging the reader to wander through the long and perplexing corridors of language specifications.

Finally, we want to thank the Koninklijke Bibliotheek, National Library of the Netherlands, for their kind permission to use selected images from their collection of illuminated medieval manuscripts.2 We hope you will appreciate the little narratives they represent within the overall fabric of this DigiCULT Thematic Issue.

1 cf. F. Manola, E. Miller: RDF Primer (W3C Working Draft, 23 January 2003), http://www.w3.org/TR/rdf-primer/

2 See their online collection of such images at http://www.kb.nl/kb/manuscripts/, which offers advanced search and presentation features.

POSITION PAPER
By Seamus Ross

Genesis - The Creation: Division of Light and Darkness

Tim Berners-Lee and his colleagues at W3C have recognised that the real benefits of the web-based information revolution will come from enabling the interoperability of content. The current generation of web delivery is, they have argued, designed for human users, who struggle to make effective use of the billions of pages of information currently accessible. When we search for something at the moment, we sometimes discover suitable candidate information, but more often than not this is far from being the case. More than this, the entire process of searching, discovery and use is designed to be driven by humans. When we discover one piece of the puzzle, we need manually to position that information so that it can help us to search out the next piece of the puzzle. We find that Darmstadt is near Frankfurt. Then we find that there are flights from Glasgow to Frankfurt and there is a bus from Frankfurt Airport to Darmstadt. Then I search for timetables, make manual comparisons and decide which times best meet my requirements. In the Shangri-La that is the Semantic Web, my ‘agent’ would recognise from its regular review of my diary that I needed to be at a meeting in Darmstadt on the 21st of January 2003, and it would search out the options, analyse the timetables, identify the optimum travel arrangements, book my non-smoking hotel accommodation and order the taxi to take me to the airport. (It might even check the weather forecasts and warn me to bring particular types of clothing.) Certainly, to make this happen there has to be a fundamental shift in the way data, information and knowledge are represented on the web.

The proliferation of web-based resources makes finding what you are looking for increasingly difficult. According to Internet user studies, in 1996 50% of Internet users reported spending time looking for information without finding it, but by 2002 only about 40% of users ended their ‘searching sessions’ unsuccessfully. At first glance we might conclude that web discovery tools have improved and/or the information searching skills of users have improved. Over the past seven years the quantity of content has mushroomed, the search tools have become more efficient, developers approach the use of meta-tags more effectively, and anecdotal evidence suggests that the searching techniques of users have become more sophisticated. We should continue to be surprised by the high failure rate and wonder why it remains proportionally so high as the numbers of users have grown to nearly 600 million. In reality, there is just too much content available. It is poorly described. It is not interconnected. Search engines themselves are blunt instruments. Most users of the web do not have very mature searching strategies and rarely use even the blunt instruments as effectively as they might. A solution is to make more of the information capable of discovery, interpretation and reuse by automated information processing tools themselves. However, the current ways in which content is represented on the web make it nearly impossible for machines to search the web meaningfully and effectively - even with the limitations of their skills and tools, humans are better at searching the web than the most powerful of the current generation of agents. The emergence of the Semantic Web would solve this problem.

The web has made us realise the tremendous potential of digital resources and made them widely available. Content as presented on the web currently is mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented so that it is visible and discoverable within the context of the Semantic Web.

[Illustration: Genesis – The Creation. Division of the Waters Above and Below the Firmament]

The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.

TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES

Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about it in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL1, do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S) because it provides a flexible and extensible mechanism to represent metadata. A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML-based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response, other languages such as DAML+OIL have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web. If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).

The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains, it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?

1 See the ‘Semantic Web Terms and Reading List’ in this Issue.

ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB

For the Semantic Web to succeed it will require not only modelling languages such as XML, RDF and OWL, but it will also require methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.

The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections and some simple rules of inference and logic, for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established, such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but that does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:

• Can we cost the creation of appropriate ontologies for the heritage sector?

• How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop and which ones will we be able to borrow from other sectors)?

• What heritage-based organisations should focus on ontology creation?

• Ontologies often fail to be interoperable. What solutions are there to this problem and how can they be made to work effectively?

• Does OWL (W3C’s Web Ontology Semantic Markup Language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
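
The multilingual gap in the watercolour example can be made concrete with a minimal sketch. It is written in plain Python and is not OWL; the `Concept` class, the `build_ontology`, `find_concept` and `is_type_of` helpers, and the language codes are all assumptions introduced for illustration. It shows that a class taxonomy only answers questions in the languages for which labels have been recorded.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Concept:
    """One class in a toy taxonomy, with labels grouped by language code."""
    name: str
    parent: Optional["Concept"] = None
    labels: Dict[str, Set[str]] = field(default_factory=dict)

    def is_a(self, other: "Concept") -> bool:
        """Walk up the parent chain to test the 'is a type of' relation."""
        node: Optional["Concept"] = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


def build_ontology(multilingual: bool) -> List[Concept]:
    """Build the toy taxonomy, optionally with French labels (hypothetical data)."""
    painting = Concept("Painting", labels={"en": {"painting"}})
    watercolour = Concept("Watercolour", parent=painting,
                          labels={"en": {"watercolour"}})
    jewellery = Concept("Jewellery", labels={"en": {"jewellery"}})
    necklace = Concept("Necklace", parent=jewellery,
                       labels={"en": {"necklace"}})
    if multilingual:
        painting.labels["fr"] = {"peinture"}
        watercolour.labels["fr"] = {"aquarelle"}
    return [painting, watercolour, jewellery, necklace]


def find_concept(ontology: List[Concept], term: str) -> Optional[Concept]:
    """Return the concept that carries this label in any language, if any."""
    wanted = term.lower()
    for concept in ontology:
        if any(wanted in names for names in concept.labels.values()):
            return concept
    return None


def is_type_of(ontology: List[Concept], narrower: str, broader: str) -> bool:
    """True only if both terms are known and the first is a kind of the second."""
    a = find_concept(ontology, narrower)
    b = find_concept(ontology, broader)
    return a is not None and b is not None and a.is_a(b)


for multilingual in (False, True):
    onto = build_ontology(multilingual)
    print("French labels:", multilingual,
          "| watercolour/painting:", is_type_of(onto, "watercolour", "painting"),
          "| aquarelle/painting:", is_type_of(onto, "aquarelle", "painting"))
```

Without the French labels the taxonomy cannot relate ‘aquarelle’ to ‘painting’ even though the underlying class relationship is present. The cost of recording such labels is part of what the questions above ask the sector to weigh.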

Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.

Proof and trust is emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity, and a lack of methods for testing its presence.

LEGITIMISING THE SEMANTIC WEB INVESTMENT

Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of webpage design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen we need a stronger argument for the benefits that such investment will bring to the heritage sector.

Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple, and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.

At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do and you could incrementally add more later. Slowly heritage institutions found ways to take advantage of the opportunities offered by the web; there are still many small and medium-sized heritage institutions that have not.

[Illustration: Genesis – The Creation. Division of Sea and Earth. Creation of Trees and Plants]

Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to heritage sector institutions through better knowledge about, care of and access to their collections. In the ALM (archives, libraries and museums) sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited except, of course, at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
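
To illustrate why machine-readable opening hours matter to such an agent, here is a minimal sketch in Python. The data structures are invented for illustration: `OPENING_HOURS` and `ARRIVALS` are hypothetical, and a real deployment would read them from published mark-up rather than from literals in the code. The function simply relates arrival times to opening times, as a visit-planning agent might.

```python
from datetime import time

# Hypothetical machine-readable opening hours for one institution.
# None means the institution is closed on that day.
OPENING_HOURS = {
    "mon": None,
    "tue": (time(10, 0), time(17, 0)),
    "sat": (time(11, 0), time(16, 0)),
}

# Hypothetical public transport arrivals at the nearest stop.
ARRIVALS = {
    "mon": [time(9, 40), time(11, 10)],
    "tue": [time(9, 40), time(11, 10), time(16, 50)],
    "sat": [time(10, 30), time(15, 30)],
}


def minutes(t: time) -> int:
    """Convert a time of day to minutes since midnight."""
    return t.hour * 60 + t.minute


def usable_arrivals(day: str, min_visit_minutes: int = 60) -> list:
    """Return the arrivals that leave at least min_visit_minutes inside."""
    hours = OPENING_HOURS.get(day)
    if hours is None:
        return []
    opens, closes = hours
    result = []
    for arrival in ARRIVALS.get(day, []):
        # A visitor who arrives early has to wait until the doors open.
        start = max(arrival, opens)
        if minutes(closes) - minutes(start) >= min_visit_minutes:
            result.append(arrival)
    return result


for day in ("mon", "tue", "sat"):
    print(day, [t.strftime("%H:%M") for t in usable_arrivals(day)])
```

The logic is trivial once the data are structured. The difficulty the text points to lies in getting every institution to publish its access arrangements in an agreed form.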

The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.


CONCLUSION

Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.

Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.

Bibliography

Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web – ISWC 2002, Berlin: Springer, 117-131.

Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.

Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3:1, 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf

Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.

Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.

Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web – ISWC 2002, Berlin: Springer, 448-453.

Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.

Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.

Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 120:10, 676-680.

Klein, M. 2001. XML, RDF and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.

McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.

Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web – ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf

Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.

Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA), ACM Publication.

Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.


DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL

AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS

By Joost van Kasteren

‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’

Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.

The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).

The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is being carried out on applying the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.

Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
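
A minimal sketch of what such a mapping could look like follows. The element names are standard Dublin Core elements, but the assignment of elements to the five Ws in `FIVE_WS` is an assumption made for this example and is not the DEN’s actual mapping. The two records and the `_database` key are likewise invented.

```python
# Assumed mapping from the 5 Ws to Dublin Core elements (illustrative only).
FIVE_WS = {
    "who": ("creator", "contributor", "publisher"),
    "what": ("title", "subject", "type"),
    "when": ("date",),
    "where": ("coverage",),
    "why": ("description",),
}

# Hypothetical records from two different institutional databases.
RECORDS = [
    {"_database": "museum", "title": "View of Delft harbour",
     "creator": "Unknown painter", "type": "watercolour",
     "date": "1650", "coverage": "Delft"},
    {"_database": "archive", "title": "Harbour ledger",
     "contributor": "Delft harbour office",
     "date": "1702", "coverage": "Delft"},
]


def search(records, w, term):
    """Return records whose elements for the given W contain the term."""
    wanted = term.lower()
    hits = []
    for record in records:
        for element in FIVE_WS[w]:
            if wanted in record.get(element, "").lower():
                hits.append(record)
                break  # one matching element is enough for this record
    return hits


for w, term in (("where", "Delft"), ("who", "harbour office")):
    found = [r["_database"] for r in search(RECORDS, w, term)]
    print(f"{w} = {term!r}: {found}")
```

The ‘fuzziness’ Kersen mentions is visible here: a ‘who’ query cannot distinguish a creator from a publisher, which is the price of searching databases with different native schemas through one coarse set of questions.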

According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’

Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point to several initiatives. Considered on their own they might not seem like much in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.

There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’

Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.

The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)

Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’

Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl


DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.

‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction, it is like North. You go North but you never arrive and say “here it is”. This is the Semantic Web,’ said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.

[Illustration: Genesis – The Creation. Birds and Fishes]

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 – historians, language and information technology scientists, academics and publishers – were looking even further down the information autobahn, to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article1, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.

The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning – Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology2 for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’

1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow, 1991.

Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF. http://www.cultos.org

The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman, Dr Frank Nack from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic-based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).3 But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos4 - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC5 reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’

3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it

4 Telos Corporation, Ashburn, VA, US: http://www.telos.com

5 CIDOC, International Committee for Documentation of the International Council of Museums (ICOM-CIDOC): forum for documentation interests of museums and related organisations, one of 25 international committees of the International Council of Museums (ICOM). http://www.willpowerinfo.myby.co.uk/cidoc/CIDOCe

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved


Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the Web standards HTTP and XML and employs Dublin Core (unqualified) as metadata standard. Heritage organisations that have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001. http://www.archimuse.com/mw2001/papers/perkins
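The harvesting requests described in this box can be sketched in a few lines of Python. This is a minimal illustration only: the repository address and the record identifier are hypothetical, and the sketch merely builds the request URLs for the ListRecords and GetRecord verbs with unqualified Dublin Core (`oai_dc`) as the metadata format. It does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def build_oai_request(base_url, verb, **arguments):
    """Return the URL of an OAI-PMH request for the given verb and arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


# Ask for all records, expressed in unqualified Dublin Core.
list_url = build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Ask for a single record; the identifier is a made-up example.
get_url = build_oai_request(
    BASE_URL,
    "GetRecord",
    identifier="oai:example.org:item-1",
    metadataPrefix="oai_dc",
)

print(list_url)
print(get_url)
```

A service provider would send such requests over HTTP and validate the XML responses against the schemas mentioned above.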

Open Archives Forum Project
http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003. http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php

by projects like the Web Services of the Open Archives Initiative (OAI).

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s


Office of the e-Envoy,6 the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.7 ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame8 and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Markup Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological9 relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
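Mr Guarino’s point about stipulating the meaning of ‘before’ can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the `Period` class, the two function names and the year ranges are invented and are not taken from any ontology discussed at the Forum. It shows only that two systems using the same word can give different answers until the meaning is fixed by an explicit rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period (illustrative value)
    end: int    # last year of the period (illustrative value)


def before_strict(a, b):
    """'Before' stipulated to exclude overlap: a must end before b starts."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """'Before' stipulated to allow overlap: a only has to start earlier."""
    return a.start < b.start


# Two overlapping periods with made-up boundaries.
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

print(before_strict(renaissance, baroque))            # False
print(before_allowing_overlap(renaissance, baroque))  # True
```

Two databases that exchange the statement ‘Renaissance before Baroque’ would agree only if both had adopted the same one of these two rules, which is the role the axioms play in a foundational ontology.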

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

6 Office of the e-Envoy: http://www.e-envoy.gov.uk

7 Govtalk: http://www.govtalk.gov.uk

8 Sesame environment: http://www.ontoknowledge.org/tools/factsheetSesame.html

9 Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow, 1991.

development could begin, basic classifications and methodologies would be required to form a foundation for the work.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’


MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003. http://www.cultivate-int.org/issue9/mesmuses

‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage, architecture, film, whatever, all working with very different substances.’

Seamus Ross added: ‘So, we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Standards Organisation (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So, this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So, where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so, too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000 the CIDOC CRM has been in development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001 the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council for Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002. http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


Genesis – The Creation Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X

Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.


OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000. www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC). http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK. http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001. http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity
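To give an impression of what ‘based on XML’ means for SOAP, the following Python sketch builds a minimal SOAP 1.1 request message with the standard library. The service namespace, the operation name `SearchObjects` and its `keyword` parameter are hypothetical, invented for this illustration; only the SOAP envelope namespace is taken from the SOAP 1.1 specification. The sketch constructs the message but does not send it.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace of an imaginary museum search service.
SERVICE_NS = "http://museum.example.org/search"


def build_soap_request(operation, parameters):
    """Return a SOAP envelope (as a string) that calls one operation."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in parameters.items():
        child = ET.SubElement(call, f"{{{SERVICE_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call: search the imaginary service for one keyword.
request_xml = build_soap_request("SearchObjects", {"keyword": "Galileo"})
print(request_xml)
```

In a real deployment the operation names and parameters would come from the service’s WSDL description, and the service itself might be located through a UDDI registry.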

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies, based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that a generic ontology has to pin down is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
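The violist example can be played through in a short Python sketch. It is an illustration under assumptions of our own, not a method proposed by Guarino: the relation names `COMPONENT_OF` and `MEMBER_OF` are invented here to stand for two possible readings of ‘part’. If both readings are stored under one transitive ‘part of’ relation, the finger ends up as part of the orchestra; if they are kept apart, it does not.

```python
# Two hypothetical readings of 'part', kept as separate sets of pairs.
COMPONENT_OF = {("finger", "violist")}   # a body part of a person
MEMBER_OF = {("violist", "orchestra")}   # a member of a group


def transitive_closure(pairs):
    """Return all pairs that follow from chaining the given pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Naive model: one ambiguous 'part of' relation covering both readings.
naive_part_of = transitive_closure(COMPONENT_OF | MEMBER_OF)

# Disambiguated model: each reading is chained only with itself.
typed_part_of = transitive_closure(COMPONENT_OF) | transitive_closure(MEMBER_OF)

print(("finger", "orchestra") in naive_part_of)   # True
print(("finger", "orchestra") in typed_part_of)   # False
```

Which of the two answers is wanted is exactly the kind of decision that, in Guarino’s view, has to be stipulated explicitly rather than left to the everyday meaning of the word.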

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics, that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it


SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES

AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren


The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’ [1]. This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’ [2]. Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, and provides illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’ [3]

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
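The grouping step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FMS implementation: the flat (item_id, field, value) row layout, the ROWS sample and the function name rows_to_documents are hypothetical, and the element names and namespace are those of the fictitious m-i.org example used in this primer.

```python
import xml.etree.ElementTree as ET
from itertools import groupby
from operator import itemgetter

MI_NS = "http://www.m-i.org/images"  # namespace of the fictitious m-i.org site

# Hypothetical rows as a database view might return them:
# one (item_id, field, value) row per metadata field of a collection item.
ROWS = [
    ("image5kb78d38i", "type", "Column Miniature"),
    ("image5kb78d38i", "subject", "Eve emerges from Adam's body"),
    ("image5kb78d38i", "iconclass", "71A3421"),
    ("image5kb78d38i", "creator", "Alexander Master"),
    ("image5kb78d38i", "manuscript", "Historic Bible"),
    ("image5kb78d38i", "place", "Utrecht"),
    ("image5kb78d38i", "year", "circa 1430"),
]


def rows_to_documents(rows):
    """Group rows by item and build one XML document (as text) per item."""
    ET.register_namespace("mi", MI_NS)  # serialise with the mi: prefix
    documents = {}
    # sorted() is stable, so the field order within an item is preserved
    ordered = sorted(rows, key=itemgetter(0))
    for item_id, group in groupby(ordered, key=itemgetter(0)):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for _, field, value in group:
            child = ET.SubElement(root, f"{{{MI_NS}}}{field}")
            child.text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


if __name__ == "__main__":
    for item_id, text in rows_to_documents(ROWS).items():
        print(item_id)
        print(text)
```

Each item ends up as one document whose root element carries the item identifier as an attribute, which is the shape of the example document discussed in this primer.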

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image, 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Graphic 1: Database rows to XML process

Database rows, grouped by item:
image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

Rows to XML process

XML document with the item data from the database rows (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). The namespace declared in the start tag of our root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
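These syntax rules can be tried out with any XML parser. The following minimal sketch uses the parser from Python's standard library; the helper name is_well_formed and the three sample strings are our own illustrations, built from the m-i.org example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Matching tags, quoted attribute value, declared namespace prefix.
GOOD = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)

# Start tag and end tag differ in case, so they do not match.
BAD_CASE = GOOD.replace("<mi:creator>", "<mi:Creator>")

# The attribute value is not within quotation marks.
BAD_QUOTES = GOOD.replace('"image5kb78d38i"', "image5kb78d38i")

if __name__ == "__main__":
    for name, sample in [("GOOD", GOOD), ("BAD_CASE", BAD_CASE), ("BAD_QUOTES", BAD_QUOTES)]:
        print(name, is_well_formed(sample))
```

Note that well-formedness says nothing about which elements are allowed; that is the task of a schema.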

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)   <xs:element name="image">
(4)     <xs:complexType>
(5)       <xs:sequence>
(6)         <xs:element name="type" type="xs:string"/>
(7)         <xs:element name="subject" type="xs:string"/>
(8)         <xs:element name="iconclass" type="xs:string"/>
(9)         <xs:element name="creator" type="xs:string"/>
(10)        <xs:element name="manuscript" type="xs:string"/>
(11)        <xs:element name="place" type="xs:string"/>
(12)        <xs:element name="year" type="xs:string"/>
(13)      </xs:sequence>
(14)      <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)    </xs:complexType>
(16)  </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element <xs:element name="image"> there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
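To make the effect of these declarations tangible, the sketch below re-states the three main constraints of the example schema (root element, required attribute, ordered sequence of text-only children) as plain Python checks. It is an illustration only and an assumption on our part: it is hand-written, it is not an XML Schema validator, and the names check_image_document, EXPECTED_CHILDREN and VALID do not come from the FMS project. In practice a schema-aware validator would read the schema itself.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"

# Child elements in the order given by the xs:sequence of the example schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

VALID = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:type>Column Miniature</mi:type>"
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:iconclass>71A3421</mi:iconclass>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:manuscript>Historic Bible</mi:manuscript>"
    "<mi:place>Utrecht</mi:place><mi:year>circa 1430</mi:year>"
    "</mi:image>"
)


def check_image_document(text):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    root = ET.fromstring(text)
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element must be image in the m-i.org namespace")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if [child.tag for child in root] != expected:
        problems.append("child elements differ from the declared sequence")
    if any(len(child) for child in root):  # len() counts sub-elements
        problems.append("child elements must contain text only")
    return problems


if __name__ == "__main__":
    print(check_image_document(VALID))
```

A document that drops the image_id attribute or swaps two child elements is still well-formed XML, but it fails these checks, which is the difference between well-formedness and validity.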

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling for example the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge. [4]

Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet, what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge. [5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies [6], and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems in which terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate. [7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture. [9]

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’ [10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM) [11]. This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably. [12]


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
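In the hierarchical path shown here, every broader concept has a code that is a leading part of the narrower code. The small Python sketch below exploits this to derive the path from the code alone. It is an illustration for plain codes such as 71A3421 only; we assume nothing about other, more complex Iconclass notations, and the names notation_path, describe and LABELS are our own.

```python
# Labels copied from the hierarchical path for 71A3421 given in this primer.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def notation_path(notation):
    """Return every leading part of a plain notation, broadest first."""
    return [notation[:length] for length in range(1, len(notation) + 1)]


def describe(notation, labels):
    """Return 'code label' lines for each step of the hierarchical path."""
    return [f"{code} {labels[code]}" for code in notation_path(notation)]


if __name__ == "__main__":
    for line in describe("71A3421", LABELS):
        print(line)
```

A retrieval system can use the same idea in reverse: a search for 71A34 (creation of Eve) also finds every image whose code begins with these characters.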

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between

| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
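Because a triple is nothing more than three identifiers, the two statements of graphic 2 can be held in a very simple data structure. The following Python sketch stores them as tuples and answers pattern queries in which any position may be left open. The helper name match is our own, and writing the class URIs with a ‘#’ separator after the namespace is an assumption made for this illustration.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # fictitious schema namespace

# Each triple is (subject, predicate, object), all given as URIs here.
TRIPLES = {
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        triple for triple in triples
        if (subject is None or triple[0] == subject)
        and (predicate is None or triple[1] == predicate)
        and (obj is None or triple[2] == obj)
    )


if __name__ == "__main__":
    # Which statements are made about the class Miniature as subject?
    for triple in match(TRIPLES, subject=MI + "Miniature"):
        print(triple)
```

The same URI may appear as subject in one triple and as object in another, which is what links individual statements into a graph.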

Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples in which XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML processing software (XSLT, XPointer and others).

When such a rule is applied to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples in which the XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>
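How such a template is instantiated can be sketched with the limited XPath support of Python's standard library. This is a simplified illustration under our own assumptions and not the rule format of the FMS tool: the rule is a plain tuple of (path, property, path), the names RULE, DOCUMENT and apply_rule are hypothetical, and the element text is used as-is, whereas a real mapping tool may replace such values by instances of ontology classes.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}  # prefix map for the path expressions

DOCUMENT = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Hypothetical rule: (path to subject value, property, path to object value).
RULE = ("mi:type", "mi:hasCreator", "mi:creator")


def apply_rule(rule, xml_text):
    """Instantiate the rule with element values; None if the rule does not match."""
    subject_path, prop, object_path = rule
    root = ET.fromstring(xml_text)
    subject = root.findtext(subject_path, namespaces=NS)
    obj = root.findtext(object_path, namespaces=NS)
    if subject is None or obj is None:
        return None  # one of the paths selects no element
    return (subject, prop, obj)


if __name__ == "__main__":
    print(apply_rule(RULE, DOCUMENT))
```

A rule whose paths select nothing in a given document simply produces no triple, so one rule set can be applied to documents of differing completeness.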

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So using the basic RDF data model we definemiImage [resource] rdftype [property] rdfsClass[value]The self-defined prefix mi (for medievalimages) stands for the URI reference of our RDFSchema namespace httpwwwm-iorgschemasimages

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:

mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:

mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
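The transitivity of the subclass relationship can be illustrated with a minimal Python sketch. It holds the class statements from this section as plain (resource, property, value) tuples, written in the usual prefix:name notation, and follows the subclass arcs upwards. The helper name superclasses is hypothetical, and a real system would use an RDF toolkit rather than hand-written tuples.

```python
# The class statements from this section as (resource, property, value) tuples.
TRIPLES = {
    ("mi:Image", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", "rdf:type", "rdfs:Class"),
    ("mi:ColumnMiniature", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf arcs."""
    found = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for subject, prop, value in triples:
            if subject == current and prop == "rdfs:subClassOf" and value not in found:
                found.add(value)
                pending.append(value)
    return found


# mi:Image is found although no statement links it directly to mi:ColumnMiniature.
print(sorted(superclasses("mi:ColumnMiniature", TRIPLES)))  # ['mi:Image', 'mi:Miniature']
```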

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:

mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:

mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:

mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
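How such restrictions could be used to detect violations is sketched below in Python. The schema statements are those given in this section; the instance data (everything with the ex: prefix) and the helper violations are hypothetical. The sketch follows the primer's reading of domain and range as constraints to be checked, which is a simplification of how RDF Schema processors may treat them.

```python
# Schema statements from this section, as (resource, property, value) tuples.
SCHEMA = {
    ("mi:ColumnMiniature", "rdf:type", "rdfs:Class"),
    ("mi:hasType", "rdf:type", "rdf:Property"),
    ("mi:hasType", "rdfs:range", "mi:ColumnMiniature"),
    ("mi:hasType", "rdfs:domain", "mi:ColumnMiniature"),
}

# Hypothetical instance data, invented for this illustration.
DATA = {
    ("ex:m1", "rdf:type", "mi:ColumnMiniature"),
    ("ex:m2", "rdf:type", "mi:ColumnMiniature"),
    ("ex:m1", "mi:hasType", "ex:m2"),        # satisfies domain and range
    ("ex:page9", "mi:hasType", "ex:m2"),     # subject is not a ColumnMiniature
    ("ex:m1", "mi:hasType", "ex:initial7"),  # value is not a ColumnMiniature
}


def violations(schema, data, constraint):
    """Return statements that break the given constraint.

    constraint is "rdfs:domain" (checks the subject) or "rdfs:range"
    (checks the value).
    """
    position = 0 if constraint == "rdfs:domain" else 2
    required = {s: o for s, p, o in schema if p == constraint}
    typed = {(s, o) for s, p, o in data if p == "rdf:type"}
    return sorted(
        t for t in data
        if t[1] in required and (t[position], required[t[1]]) not in typed
    )


print(violations(SCHEMA, DATA, "rdfs:domain"))  # [('ex:page9', 'mi:hasType', 'ex:m2')]
print(violations(SCHEMA, DATA, "rdfs:range"))   # [('ex:m1', 'mi:hasType', 'ex:initial7')]
```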

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:

| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.’¹³

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com Definitions: Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
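Conceptually, combining the harvested descriptions amounts to taking the union of the RDF statements. The Python sketch below shows only that step, on hypothetical data; the museum names, the ex: identifiers and the function build_repository are assumptions made for the illustration and do not describe the actual FMS crawler.

```python
def build_repository(ontology, harvested):
    """Union the shared ontology with every museum's harvested statements."""
    repository = set(ontology)
    for statements in harvested.values():
        repository.update(statements)
    return repository


# Hypothetical shared ontology and harvested instance descriptions.
ONTOLOGY = {("ex:Textile", "rdf:type", "rdfs:Class")}
HARVESTED = {
    "museum-a": [("ex:a/1", "rdf:type", "ex:Textile")],
    "museum-b": [
        ("ex:b/7", "rdf:type", "ex:Textile"),
        ("ex:a/1", "rdf:type", "ex:Textile"),  # same statement seen twice
    ],
}

repository = build_repository(ONTOLOGY, HARVESTED)
print(len(repository))  # 3: identical statements collapse into one
```

Because the repository is a set of statements, the same statement harvested from two places appears in the graph only once.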

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining the classes (views) further, the user eventually finds the collection instance data searched for.
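The principle of view-based filtering can be sketched as follows. This is not how Ontogator is implemented, which the text does not describe; the ontology fragment, the collection records and the function names are hypothetical, and the sketch assumes a single-parent class hierarchy without cycles.

```python
# Hypothetical ontology fragment: child class -> parent class.
SUBCLASS_OF = {
    "ex:Carpet": "ex:Textile",
    "ex:Finland": "ex:Europe",
}

# Hypothetical collection records and the classes they are annotated with.
INSTANCE_TYPES = {
    "ex:item1": {"ex:Carpet", "ex:Finland"},
    "ex:item2": {"ex:Carpet"},
    "ex:item3": {"ex:Finland"},
}


def with_ancestors(cls):
    """Return cls together with all of its superclasses."""
    result = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.add(cls)
    return result


def filter_by_views(selected):
    """Return the records that match every selected class restriction."""
    matches = []
    for item, classes in INSTANCE_TYPES.items():
        expanded = set()
        for cls in classes:
            expanded |= with_ancestors(cls)
        if set(selected) <= expanded:
            matches.append(item)
    return sorted(matches)


print(filter_by_views(["ex:Textile"]))               # ['ex:item1', 'ex:item2']
print(filter_by_views(["ex:Textile", "ex:Europe"]))  # ['ex:item1']
```

Adding a second restriction narrows the result set, which mirrors the way further constraining the views leads to the records searched for.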

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.¹⁴

This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems:¹⁵

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’

| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’

| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

| Learning: an agent will improve its performance over time.

14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/

K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf

[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. The Finnish Museums on the Semantic Web: Overview of the System’s Set-up. Components shown: user client (WWW browser with topic-based navigation and view-based filtering); server (Ontogator software); RDF database (semantic graph, the knowledge space of shared ontology and metadata); Web crawler; RDF Schema (semantic interoperability); XML Schema (syntactic interoperability); relational schemas (DBMS); and, for each museum, collection database 1 to n, XML repository, Metadata Editor and RDF instances.]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 – Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 – Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5, from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7, from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


and exhibitions that not only display objects and simple descriptions (drawn from metadata), but also allow for understanding relationships between objects (created by semantically interrelated metadata). The Semantic Web community promises to assist in achieving this goal, but the challenge for the heritage institutions would be to first implement the necessary data infrastructure.

The challenge for the Semantic Web expert round table was, or at least the DigiCULT Secretariat thought it was, not to run into a debate between ‘theory’ and ‘practice’; in other words, between what academic Semantic Web scholars and what practitioners from heritage institutions think needs to be accomplished, what is feasible and affordable, and where to concentrate efforts. For the discussion, XML seemed to provide a good starting point: on the one hand, XML is increasingly considered by heritage institutions as a key standard for publishing metadata on the Web; on the other hand, it is a major building block for the Semantic Web. It proved different, in a positive sense. In the discussion, wide use of XML was taken for granted, while the key area of interest that surfaced, and was seen to be most fruitful to explore, was ontologies.

OVERVIEW

Setting the context for this Issue, the position paper looks into the requirements for achieving the goals of the Semantic Web and assesses whether the available technologies will be able to deliver on what the advocates of the Semantic Web envisage, as well as whether the cultural heritage sector is in a position to take substantial steps towards semantic interoperability. It concludes with the argument that the sector is more likely to be left behind, due in particular to the fact that for the institutions the rewards for the necessary investments are still too nebulous.

Janneke van Kersen, from the Dutch Digital Heritage Association, in her interview with the DigiCULT Journalist suggests that despite the cloudy Semantic Web horizon there are medium-term benefits to be gained for heritage institutions in taking steps towards the vision. And she states that it is up to associations like hers, together with larger institutions, to take the lead in this, prove that proposed solutions work, and support smaller institutions in taking advantage of them. On the other hand, Nicola Guarino in his interview believes that reaching the ‘real’ Semantic Web lies in taking ‘the fundamental route’ of implementing generic ontologies based on linguistics and logics within the Semantic Web fabric. He also claims that even incremental progress along this path can have remarkable pay-offs.

Michael Steemson’s summary of the Darmstadt Forum illustrates that the Semantic Web topic resembles a labyrinth with currently no definite map or Ariadne’s Thread at hand. Building on the many technologies the Forum participants mentioned as some of the labyrinth’s angles, we have added to the summary a list of resources related to these technologies.

In an effort to raise the veil of mystery surrounding the Semantic Web, this issue includes an example from the sector on the implementation of semantic interoperability of metadata, combined with a primer that explains core building blocks such as XML, RDF and ontologies. While a detailed primer of, for example, RDF would alone exhaust the limits of this issue [1], the goal here is to deliver an ‘all-inclusive’ primer within the space permitted, with all the inevitable limitations this entails. The primer attempts to provide a general understanding of the Semantic Web architecture without obliging the reader to wander through the long and perplexing corridors of language specifications.

Finally, we want to thank the Koninklijke Bibliotheek, National Library of the Netherlands, for their kind permission to use selected images from their collection of illuminated medieval manuscripts [2]. We hope you will appreciate the little narratives they represent within the overall fabric of this DigiCULT Thematic Issue.

[1] cf. F. Manola, E. Miller: RDF Primer (W3C Working Draft, 23 January 2003), http://www.w3.org/TR/rdf-primer/

[2] See their online collection of such images at http://www.kb.nl/kb/manuscripts, which offers advanced search and presentation features.

POSITION PAPER

By Seamus Ross

Tim Berners-Lee and his colleagues at W3C have recognised that the real benefits of the web-based information revolution will come from enabling the interoperability of content. The current generation of web delivery is, they have argued, designed for human users, who struggle to make effective use of the billions of pages of information currently accessible. When we search for something at the moment, we sometimes discover suitable candidate information, but more often than not this is far from being the case. More than this, the entire process of searching, discovery and use is designed to be driven by humans. When we discover one piece of the puzzle, we need manually to position that information so that it can help us to search out the next piece of the puzzle. We find that Darmstadt is near Frankfurt. Then we find that there are flights from Glasgow to Frankfurt and there is a bus from Frankfurt Airport to Darmstadt. Then I search for timetables, make manual comparisons, and decide which times best meet my requirements. In the Shangri-La that is the Semantic Web, my ‘agent’ would recognise from its regular review of my diary that I needed to be at a meeting in Darmstadt on the 21st of January 2003, and it would search out the options, analyse the timetables, identify the optimum travel arrangements, book my non-smoking hotel accommodation, and order the taxi to take me to the airport. (It might even check the weather forecasts and warn me to bring particular types of clothing.) Certainly, to make this happen there has to be a fundamental shift in the way data, information and knowledge are represented on the web.

Genesis – The Creation: Division of Light and Darkness

The proliferation of web-based resources makes finding what you are looking for increasingly difficult. According to Internet user studies, in 1996 50% of Internet users reported spending time looking for information without finding it, but by 2002 only about 40% of users ended their ‘searching sessions’ unsuccessfully. At first glance we might conclude that web discovery tools have improved and/or the information searching skills of users have improved. Over the past seven years the quantity of content has mushroomed, the search tools have become more efficient, developers approach the use of meta-tags more effectively, and anecdotal evidence suggests that the searching techniques of users have become more sophisticated. We should continue to be surprised by the high failure rate and wonder why it remains proportionally so high as the numbers of users have grown to nearly 600 million. In reality there is just too much content available. It is poorly described. It is not interconnected. Search engines themselves are blunt instruments. Most users of the web do not have very mature searching strategies and rarely use even the blunt instruments as effectively as they might. A solution is to make more of the information capable of discovery, interpretation and reuse by automated information processing tools themselves. However, the current ways content is represented on the web make it nearly impossible for machines to search the web meaningfully and effectively; even with the limitations of their skills and tools, humans are better at searching the web than the most powerful of the current generation of agents. The emergence of the Semantic Web would solve this problem.

The web has made us realise the tremendous potential of digital resources and made them widely available. Content as presented on the web currently is mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented so that it is visible and discoverable within the context of the Semantic Web.

Genesis – The Creation: Division of the Waters Above and Below the Firmament

The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public, and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.

TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES

Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL [1], do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S) because it provides a flexible and extensible mechanism to represent metadata. A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML-based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response, other languages such as DAML+OIL have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web. If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).

The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains; it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?
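The point that element names alone carry no meaning can be illustrated with a minimal Python sketch. The two records, their element names and the `FIELD_MAP` table are hypothetical and are not taken from any real heritage schema; the sketch only shows that a parser sees names and structure, while any shared meaning has to be supplied from outside the XML.

```python
import xml.etree.ElementTree as ET

# Two hypothetical records describing the same work with different element names.
RECORD_A = "<painting><creator>Rembrandt</creator><title>The Night Watch</title></painting>"
RECORD_B = "<work><artist>Rembrandt</artist><name>The Night Watch</name></work>"

# Assumed, hand-written mapping from local element names to shared concepts.
# Neither a DTD nor an XML Schema supplies this; it is where the meaning lives.
FIELD_MAP = {"creator": "maker", "artist": "maker", "title": "label", "name": "label"}


def raw_fields(xml_text):
    """Return element names and text exactly as the parser sees them."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def shared_fields(xml_text):
    """Translate local element names into shared concepts via FIELD_MAP."""
    return {FIELD_MAP[tag]: text for tag, text in raw_fields(xml_text).items()}


if __name__ == "__main__":
    # The raw views differ although both records describe the same work.
    print(raw_fields(RECORD_A) == raw_fields(RECORD_B))
    # The views agree once the external mapping is applied.
    print(shared_fields(RECORD_A) == shared_fields(RECORD_B))
```

In this sketch the agreement between the two records comes entirely from the hand-written mapping, which is the kind of knowledge an ontology is meant to make explicit and shareable.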

[1] See the ‘Semantic Web Terms and Reading List’ in this Issue.


ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB

For the Semantic Web to succeed, it will require not only modelling languages such as XML, RDF and OWL, but it will also require methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit the knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.

The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions, we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections, and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but it does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:

• Can we cost the creation of appropriate ontologies for the heritage sector?

• How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop, and which ones will we be able to borrow from other sectors)?

• What heritage-based organisations should focus on ontology creation?

• Ontologies often fail to be interoperable. What solutions are there to this problem, and how can they be made to work effectively?

• Does OWL (W3C’s Web Ontology Semantic Markup Language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
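The ‘watercolour is a type of painting’ example, and the multilingual gap it exposes, can be sketched in a few lines of Python. This is a toy structure under stated assumptions, not OWL: the concept identifiers, the `PARENT` and `LABELS` tables and the French labels are hypothetical illustrations of a class taxonomy whose terms are attached to concepts as language-tagged labels.

```python
# Hypothetical mini-taxonomy: each concept points to its parent ("is a type of").
PARENT = {
    "watercolour": "painting",
    "necklace": "jewellery",
    "painting": None,
    "jewellery": None,
}

# Assumed language-tagged labels; the taxonomy is stated once, on concepts,
# and every label inherits it regardless of language.
LABELS = {
    "watercolour": {"en": "watercolour", "fr": "aquarelle"},
    "painting": {"en": "painting", "fr": "peinture"},
    "necklace": {"en": "necklace", "fr": "collier"},
    "jewellery": {"en": "jewellery", "fr": "bijoux"},
}


def concept_for(term):
    """Return the concept whose label in any language matches the term."""
    wanted = term.lower()
    for concept, labels in LABELS.items():
        if wanted in labels.values():
            return concept
    return None


def is_type_of(term, ancestor_term):
    """True if term names a concept strictly below ancestor_term."""
    concept = concept_for(term)
    ancestor = concept_for(ancestor_term)
    if concept is None or ancestor is None:
        return False
    current = PARENT.get(concept)
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


if __name__ == "__main__":
    print(is_type_of("aquarelle", "painting"))    # French term, English ancestor
    print(is_type_of("watercolour", "peinture"))  # English term, French ancestor
    print(is_type_of("necklace", "painting"))     # unrelated branches
```

Without the `LABELS` table the taxonomy would answer only for the English terms, which is the limitation the paragraph above describes.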

Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.

Proof and trust is emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity and a lack of methods for testing its presence.

LEGITIMISING THE SEMANTIC WEB INVESTMENT

Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of webpage design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen, we need a stronger argument for the benefits that such investment will bring to the heritage sector.

Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple, and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard, and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.

At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly, heritage institutions found ways to take advantage of the opportunities offered by the web; there are still many small and medium-sized heritage institutions that have not.

Genesis – The Creation: Division of Sea and Earth; Creation of Trees and Plants

Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that the access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to the heritage sector institutions through better knowledge about, care of, and access to their collections. In the ALM (archives, libraries and museums) sector, only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited, except of course at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have too little visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.

The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.


CONCLUSION

Over the next five years, the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.

Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.

Bibliography

Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 117-131.

Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.

Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3(1), 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf

Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.

Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.

Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 448-453.

Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.

Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.

Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 12010, 676-680.

Klein, M. 2001. XML, RDF and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.

McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.

Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf

Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.

Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA), ACM Publication.

Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.


DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL

AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS

By Joost van Kasteren

‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’

Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.

The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).

The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.

Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
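A mapping of this kind can be sketched as a small crosswalk in Python. The interview does not say which Dublin Core elements are assigned to which of the 5 Ws, so the `FIVE_WS` grouping below is purely an assumption for illustration, as are the sample record and the function name `to_five_ws`; the element names themselves are standard Dublin Core elements.

```python
# Assumed grouping of Dublin Core elements under the 5 Ws. The actual mapping
# used by the Association is not given in the interview.
FIVE_WS = {
    "who": ["creator", "contributor", "publisher"],
    "what": ["title", "subject", "type"],
    "where": ["coverage"],
    "when": ["date"],
    "why": ["description"],
}


def to_five_ws(record):
    """Regroup a Dublin Core record (element -> list of values) under the 5 Ws."""
    view = {}
    for question, elements in FIVE_WS.items():
        values = []
        for element in elements:
            values.extend(record.get(element, []))
        view[question] = values
    return view


# Hypothetical record from one institution's database.
SAMPLE_RECORD = {
    "title": ["Illuminated Genesis page"],
    "creator": ["Unknown illuminator"],
    "date": ["15th century"],
    "coverage": ["The Netherlands"],
}

if __name__ == "__main__":
    for question, values in to_five_ws(SAMPLE_RECORD).items():
        print(question, values)
```

The fuzziness Kersen mentions shows up directly in such a crosswalk: several distinct elements collapse into one question, and elements missing from a record simply yield an empty answer.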

According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’

Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.

There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Metadata Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’

Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.

The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)

Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works; a “proof of concept”, you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’

Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl


DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds, and they were certain now.

‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction; it is like North. You go North, but you never arrive and say “here it is”. This is the Semantic Web,’ said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York, and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.

Genesis – The Creation: Birds and Fishes

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 (historians, language and information technology scientists, academics and publishers) were looking even further down the information autobahn, to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article [1], Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data, to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.

The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years, and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web’, and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component where people who have not built the


1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

2 Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.


Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

ontology themselves should then use it to combine multimedia assets with each other?’

The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’

3 The museum and Web site are rich resources for the life and work of Galileo, http://galileo.imss.firenze.it

4 Telos Corporation, http://www.telos.com, Ashburn, VA, US

5 CIDOC, International Committee for Documentation of the International Council of Museums, httpwwwwillpowerinfomybycoukcidocCIDOCe (ICOM-CIDOC). Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee in the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved


Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information, and from a NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols http and XML and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also: John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins
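To make the harvesting idea concrete, the following minimal Python sketch shows how a service provider might assemble OAI Protocol request URLs. The verb names and the `metadataPrefix` and `identifier` arguments follow the published OAI-PMH 2.0 specification; the repository address and record identifier are purely hypothetical, and the sketch only builds the URL without contacting any server.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL as an HTTP GET query string."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    params = {"verb": verb}
    params.update(arguments)
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository address, for illustration only.
BASE_URL = "http://repository.example.org/oai"

# Harvest all records as unqualified Dublin Core.
list_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Fetch one record by its (hypothetical) identifier.
record_url = oai_request_url(
    BASE_URL, "GetRecord",
    identifier="oai:example.org:item-1", metadataPrefix="oai_dc",
)
```

A harvester would fetch these URLs over http and validate the XML responses against the protocol's schema, which is the data validation role described in the box.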

Open Archives Forum Project
http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php

by projects like the Web Services of the Open Archives Initiative (OAI).

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP. That is how this key piece of legislation is referred to and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration) and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies but vocabularies that are formally defined in minimal terms.’
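The stipulation Mr Guarino describes can be illustrated with a small Python sketch. It is only an assumption about how the two readings of ‘before’ might be written down, not the axioms of any actual foundational ontology; the `Period` class, the function names and the year ranges are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval given by start and end years (illustrative only)."""
    name: str
    start: int
    end: int

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("a period cannot end before it starts")


def before_strict(a: Period, b: Period) -> bool:
    """Stipulation 1: 'before' excludes overlap; a must be over when b begins."""
    return a.end < b.start


def before_or_overlapping(a: Period, b: Period) -> bool:
    """Stipulation 2: 'before' allows overlap; a need only begin first."""
    return a.start < b.start


# Hypothetical, deliberately overlapping date ranges.
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

strict_answer = before_strict(renaissance, baroque)
loose_answer = before_or_overlapping(renaissance, baroque)
```

With these illustrative dates the two stipulations disagree: the strict reading says the Renaissance is not ‘before’ the Baroque, the looser one says it is. Two systems that exchange the word ‘before’ without fixing which definition they mean would silently contradict each other, which is the interoperability problem the formal definitions are meant to remove.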

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at httpwwwontoknowledgeorgpubshtml
See also the On-To-Knowledge book, ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

6 Office of the e-Envoy, http://www.e-envoy.gov.uk

7 Govtalk, http://www.govtalk.gov.uk

8 Sesame environment, httpwwwontoknowledgeorgtoolsfactsheetSesamehtml

9 Mereology n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

development could begin, basic classifications and methodologies would be required to form a foundation for the work.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’


MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also: M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage, architecture, film, whatever, all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Standards Organisation (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So, this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting, just yet, that the Semantic Web was important for the heritage sector but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So, where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so, too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’

The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM is being developed into an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Annotation & Authoring
‘How to annotate and have fun’, httpannotationsemanticweborgfaqfaq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council for Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


Genesis – The Creation, Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X

Compiled by Guntram Geser and Michael Steemson

In the Summary, we have provided information boxes on projects but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.


OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C working draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
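As a rough illustration of what ‘based on XML’ means for SOAP, the Python sketch below wraps a request in a SOAP 1.1 envelope using only the standard library. The envelope namespace is the one defined for SOAP 1.1; the service namespace, the `SearchObjects` operation and its parameters are hypothetical and stand in for whatever a real collection service would describe in its WSDL.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace of an imagined museum search service.
SERVICE_NS = "http://museum.example.org/search"


def build_soap_request(operation, parameters):
    """Return a SOAP envelope (as text) calling `operation` with `parameters`."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in parameters.items():
        # Each parameter becomes a child element of the operation element.
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameters, for illustration only.
request_xml = build_soap_request(
    "SearchObjects", {"creator": "Galileo Galilei", "maxResults": 5}
)
```

In practice the operation name and parameter structure would come from the service's WSDL description, and the service itself might be located through a UDDI registry.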

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML – eXtensible Mark-up Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML/; the XML Specifications can be found at http://www.w3.org/XML/Core/

‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml/

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a concept that a generic ontology has to pin down is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics, that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it


A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.[1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.[2] Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme: http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’[3]

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer/).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/

Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item the set of rows is combined into a single XML document.
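The view-based export described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two tables `item` and `origin`, the view name `item_view` and the un-namespaced element names are hypothetical, since the structure of the museums' databases is not given here, and each item is simplified to a single row.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical two-table layout; the real museum databases are not described.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, subject TEXT, iconclass TEXT);
CREATE TABLE origin (item_id TEXT, creator TEXT, manuscript TEXT, place TEXT, year TEXT);
INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature',
                         'Eve emerges from Adam''s body', '71A3421');
INSERT INTO origin VALUES ('image5kb78d38i', 'Alexander Master',
                           'Historic Bible', 'Utrecht', 'circa 1430');
-- The view is a virtual table produced by an SQL query that joins both tables.
CREATE VIEW item_view AS
  SELECT i.item_id, i.type, i.subject, i.iconclass,
         o.creator, o.manuscript, o.place, o.year
  FROM item i JOIN origin o ON o.item_id = i.item_id;
""")

def rows_to_xml(connection):
    """Return {item_id: XML string}, one document per collection item."""
    cursor = connection.execute("SELECT * FROM item_view ORDER BY item_id")
    columns = [d[0] for d in cursor.description]
    documents = {}
    for row in cursor:
        record = dict(zip(columns, row))
        # The item identifier becomes an attribute, every other column an element.
        root = ET.Element("image", {"image_id": record.pop("item_id")})
        for name, value in record.items():
            ET.SubElement(root, name).text = value
        documents[root.get("image_id")] = ET.tostring(root, encoding="unicode")
    return documents

docs = rows_to_xml(conn)
```

Reading through a view keeps the export code independent of how each museum has split its data across tables; only the view definition has to change per database.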

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services: Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.

Database rows, grouped by item:

    'image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421',
    'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process: item data from the database rows become the XML document image5kb78d38i.xml:

    (1)  <?xml version="1.0" encoding="ISO-8859-1"?>
    (2)  <mi:image xmlns:mi="http://www.m-i.org/images"
                   image_id="image5kb78d38i">
    (3)    <mi:type>Column Miniature</mi:type>
    (4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
    (5)    <mi:iconclass>71A3421</mi:iconclass>
    (6)    <mi:creator>Alexander Master</mi:creator>
    (7)    <mi:manuscript>Historic Bible</mi:manuscript>
    (8)    <mi:place>Utrecht</mi:place>
    (9)    <mi:year>circa 1430</mi:year>
    (10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). The namespace declared in the start tag of our root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
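The rules above can be tried out with any XML parser. The following minimal sketch uses Python's standard xml.etree.ElementTree module on a shortened version of the example document; the helper name `is_well_formed` is our own and not part of any library. It also shows that the prefix mi is only a placeholder: the parser replaces it with the full namespace name.

```python
import xml.etree.ElementTree as ET

GOOD = """<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Same content, but one end tag differs in case from its start tag.
BAD = GOOD.replace("</mi:creator>", "</mi:Creator>")

def is_well_formed(text):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        # Encode to bytes so that the declared encoding matches the input.
        ET.fromstring(text.encode("ISO-8859-1"))
        return True
    except ET.ParseError:
        return False

root = ET.fromstring(GOOD.encode("ISO-8859-1"))
root_name = root.tag            # prefix expanded to "{namespace}localname"
record_id = root.get("image_id")
```

Here `is_well_formed(GOOD)` is true and `is_well_formed(BAD)` is false, because `<mi:creator>` and `</mi:Creator>` do not match.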

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document,
- which elements are child elements, as well as their order and number,
- whether an element is empty or can include text,
- the data types for elements and attributes,
- as well as default and fixed values for elements and attributes.

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

    (1)  <?xml version="1.0"?>
    (2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    (2b)            targetNamespace="http://www.m-i.org/images"
    (2c)            xmlns="http://www.m-i.org/images"
    (2d)            elementFormDefault="qualified">
    (3)    <xs:element name="image">
    (4)      <xs:complexType>
    (5)        <xs:sequence>
    (6)          <xs:element name="type" type="xs:string"/>
    (7)          <xs:element name="subject" type="xs:string"/>
    (8)          <xs:element name="iconclass" type="xs:string"/>
    (9)          <xs:element name="creator" type="xs:string"/>
    (10)         <xs:element name="manuscript" type="xs:string"/>
    (11)         <xs:element name="place" type="xs:string"/>
    (12)         <xs:element name="year" type="xs:string"/>
    (13)       </xs:sequence>
    (14)       <xs:attribute name="image_id" type="xs:string" use="required"/>
    (15)     </xs:complexType>
    (16)   </xs:element>
    (17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define the various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
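To make concrete what the schema constrains, the sketch below re-implements three of its rules by hand in Python: the namespace-qualified root element, the required image_id attribute, and the ordered sequence of child elements. It is not an XML Schema validator (Python's standard library does not include one), and the names `check_image` and `build` are our own helpers for this illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"
# Child elements in the order required by the xs:sequence (schema lines 6-12).
EXPECTED = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]

def check_image(xml_text):
    """Return a list of problems; an empty list means the checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{NS}}}image":
        problems.append("root element is not a namespace-qualified 'image'")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{NS}}}{name}" for name in EXPECTED]
    if found != wanted:
        problems.append("child elements differ from the declared sequence")
    return problems

def build(names, with_id=True):
    # Helper for the demonstration only: builds a document with the given children.
    attr = ' image_id="image5kb78d38i"' if with_id else ""
    body = "".join(f"<mi:{n}>x</mi:{n}>" for n in names)
    return f'<mi:image xmlns:mi="{NS}"{attr}>{body}</mi:image>'

valid = check_image(build(EXPECTED))
swapped = check_image(build(list(reversed(EXPECTED))))
no_id = check_image(build(EXPECTED, with_id=False))
```

A real deployment would hand the schema file to a validating parser instead; the point here is only that the schema turns such expectations into declarations that a machine can check.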

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
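The idea of keeping content apart from display can be illustrated with a small Python sketch. It stands in for real style sheets (such as XSLT or CSS) with two hypothetical functions, `to_text` and `to_html`, that render the same XML data for two different channels; the document is a shortened, un-namespaced version of our example.

```python
import xml.etree.ElementTree as ET
from html import escape

DOC = """<image image_id="image5kb78d38i">
  <subject>Eve emerges from Adam's body</subject>
  <creator>Alexander Master</creator>
</image>"""

def to_text(root):
    """Plain-text channel: one 'name: value' line per element."""
    return "\n".join(f"{child.tag}: {child.text}" for child in root)

def to_html(root):
    """HTML channel: the same data as a two-column table."""
    rows = "".join(
        f"<tr><th>{escape(child.tag)}</th><td>{escape(child.text)}</td></tr>"
        for child in root
    )
    return f"<table>{rows}</table>"

root = ET.fromstring(DOC)
text_version = to_text(root)
html_version = to_html(root)
```

The XML document itself is never changed; only the rendering function differs per channel.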

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology
In the Semantic Web architecture the semantic relationships are not embedded, but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials/

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’[10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]


Hierarchical path of the Iconclass classification 71A3421:

    7        Bible
    71       Old Testament
    71A      Genesis: from the creation to the expulsion from paradise, and later years of Adam and Eve
    71A3     creation of man; the Garden of Eden (Genesis 1:26-2)
    71A34    creation of Eve
    71A342   Eve is fashioned from Adam’s rib
    71A3421  Eve emerges from Adam’s body
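In the path shown above, every level of the hierarchy is simply a longer prefix of the classification code. A minimal Python sketch of this observation follows; the function name `hierarchical_path` and the small label table are our own, and the sketch assumes plain notations like the one above (it does not handle other features of Iconclass notations, such as bracketed additions).

```python
# Labels copied from the hierarchy above; the lookup table is illustrative only.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

def hierarchical_path(notation):
    """List every level of the hierarchy, from the top level down to the notation itself."""
    return [notation[:n] for n in range(1, len(notation) + 1)]

path = hierarchical_path("71A3421")
# Pair each level with its label where the small table above provides one.
labelled = [(code, LABELS.get(code, "")) for code in path]
```

Because broader concepts are prefixes of narrower ones, a search for everything classified under 71A34 (creation of Eve) can be expressed as a simple prefix match on the stored codes.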

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser/. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, the statements that carry this information are built by using the Resource Description Framework (RDF). RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

- resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources;
- properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
- statements: these associate a value for a named property with the resource.

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Graphic 2: RDF Data Model

    Triple 1
      Subject:   http://www.m-i.org/schemas/images#ColumnMiniature
      Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
      Object:    http://www.m-i.org/schemas/images#Miniature

    Triple 2
      Subject:   http://www.m-i.org/schemas/images#Miniature
      Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
      Object:    http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
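The two subclass statements of graphic 2 can be written down directly as (subject, predicate, object) tuples. The Python sketch below does so and follows the rdfs:subClassOf arcs to collect all superclasses of a class. It uses no RDF library; the function name `superclasses` is our own, and the ‘#’ separator in our fictitious namespace is an assumption.

```python
# Namespace names; the '#' separator for the fictitious m-i.org namespace is assumed.
MI = "http://www.m-i.org/schemas/images#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Each statement is one (subject, predicate, object) triple.
triples = {
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}

def superclasses(cls, graph):
    """Follow rdfs:subClassOf arcs transitively and return all superclasses of cls."""
    found = []
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in graph:
            if subject == current and predicate == RDFS + "subClassOf" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found

result = superclasses(MI + "ColumnMiniature", triples)
```

From the two stated triples, a program can thus conclude that every ColumnMiniature is also an Image, although this was never stated directly.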

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case, a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document, and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
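How such a template might be instantiated can be sketched with Python's standard XML library. This is an illustration only, not the Meedio tool: the XML record layout, the XPath expressions and the names XPath, RULE and apply_rule are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; the real FMS document layout is not shown here.
XML_RECORD = """
<image id="5kb78d38i">
  <type>Image Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""


class XPath(str):
    """Marks a template slot that must be filled from the XML document."""


# A mapping rule is a list of triple templates (subject, predicate, object).
RULE = [
    (XPath("./type"), "mi:hasCreator", XPath("./creator")),
]


def apply_rule(rule, root):
    """Instantiate every XPath slot; return [] if the rule does not match."""
    triples = []
    for template in rule:
        filled = []
        for slot in template:
            if isinstance(slot, XPath):
                element = root.find(str(slot))
                if element is None or element.text is None:
                    return []  # a slot has no matching element value
                filled.append(element.text.strip())
            else:
                filled.append(slot)  # fixed parts are copied unchanged
        triples.append(tuple(filled))
    return triples


result = apply_rule(RULE, ET.fromstring(XML_RECORD))
print(result)
```

With the hypothetical record above, the rule evaluates to one triple linking ‘Image Miniature’ to ‘Alexander Master’ through mi:hasCreator; a record without a creator element yields no triples at all.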

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images.

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature, and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
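The effect of transitivity can be made concrete with a short Python sketch. It is a minimal illustration under the assumption that the class hierarchy is held as a simple dictionary; the names SUBCLASS_OF and superclasses are invented for this example.

```python
# Direct rdfs:subClassOf statements from the text, keyed by subclass.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, subclass_of):
    """Return every stated or implied superclass of cls."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:  # also guards against cycles
                found.add(parent)
                stack.append(parent)
    return found


print(superclasses("mi:ColumnMiniature", SUBCLASS_OF))
```

Only two subclass statements were written down, yet following them step by step shows that mi:ColumnMiniature has both mi:Miniature and mi:Image as superclasses.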

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
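The way a validator could check statements against such declarations can be sketched as follows. This is a simplified illustration that follows the primer's reading of domain and range as constraints to be checked; it is not the FMS validator. The instance names (mi:img1, mi:img2, mi:kind1) and the function name violations are hypothetical, and only directly stated types are considered.

```python
# Property declarations as given in the text.
SCHEMA = {
    "mi:hasType": {"domain": "mi:ColumnMiniature", "range": "mi:ColumnMiniature"},
}

# Hypothetical instance data: which classes each resource is an instance of.
TYPES = {
    "mi:img1": {"mi:ColumnMiniature"},
    "mi:kind1": {"mi:ColumnMiniature"},
    "mi:img2": {"mi:Image"},
}


def violations(statement, schema, types):
    """Return a list of constraint violations for one (s, p, o) statement."""
    subject, prop, value = statement
    constraint = schema.get(prop)
    if constraint is None:
        return [f"{prop} is not a declared property"]
    problems = []
    if constraint["domain"] not in types.get(subject, set()):
        problems.append(f"{subject} is not an instance of {constraint['domain']}")
    if constraint["range"] not in types.get(value, set()):
        problems.append(f"{value} is not an instance of {constraint['range']}")
    return problems


print(violations(("mi:img1", "mi:hasType", "mi:kind1"), SCHEMA, TYPES))
print(violations(("mi:img2", "mi:hasType", "mi:kind1"), SCHEMA, TYPES))
```

The first statement satisfies both declarations, while the second is reported because its subject is not an instance of the class named in the domain declaration.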

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’13

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
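Combining harvested RDF cards into one repository amounts to taking a union of triple sets, which can be sketched as follows. The museum identifiers, instance names and the function build_repository are hypothetical; the actual FMS crawler and repository are not described at this level of detail.

```python
# Shared ontology statements (a tiny excerpt of the primer's own example).
ONTOLOGY = {
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}

# Hypothetical RDF cards as harvested from two museums' public directories.
CARDS = {
    "museum-a": {
        ("a:img1", "rdf:type", "mi:Miniature"),
        ("a:img1", "mi:hasCreator", "Alexander Master"),
    },
    "museum-b": {
        ("b:img7", "rdf:type", "mi:Miniature"),
    },
}


def build_repository(ontology, cards):
    """Union of the shared ontology and all harvested instance descriptions."""
    repository = set(ontology)
    for card in cards.values():
        repository |= set(card)  # identical triples are stored only once
    return repository


repository = build_repository(ONTOLOGY, CARDS)
print(len(repository))
```

Because every card uses the classes and properties of the shared ontology, the merged set forms a single graph even though the cards come from separate databases.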

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
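The idea of view-based filtering can be sketched over a small set of triples. This is a simplified illustration and not the Ontogator implementation: the ex: classes and items are invented, and instances are linked to classes only through rdf:type.

```python
# Hypothetical repository excerpt: two views (clothing type, material).
TRIPLES = {
    ("ex:Shirt", "rdfs:subClassOf", "ex:Clothing"),
    ("ex:Apron", "rdfs:subClassOf", "ex:Clothing"),
    ("ex:Linen", "rdfs:subClassOf", "ex:Material"),
    ("ex:item1", "rdf:type", "ex:Shirt"),
    ("ex:item1", "rdf:type", "ex:Linen"),
    ("ex:item2", "rdf:type", "ex:Apron"),
    ("ex:item3", "rdf:type", "ex:Linen"),
}


def subclasses(cls, triples):
    """The class itself plus all of its direct and indirect subclasses."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == "rdfs:subClassOf" and o in result and s not in result:
                result.add(s)
                changed = True
    return result


def instances_of(cls, triples):
    """Instances typed with the class or with any of its subclasses."""
    wanted = subclasses(cls, triples)
    return {s for s, p, o in triples if p == "rdf:type" and o in wanted}


def view_filter(selected_classes, triples):
    """Instances that satisfy every selected class restriction."""
    matches = None
    for cls in selected_classes:
        found = instances_of(cls, triples)
        matches = found if matches is None else matches & found
    return matches or set()


print(view_filter(["ex:Clothing"], TRIPLES))
print(view_filter(["ex:Clothing", "ex:Linen"], TRIPLES))
```

Selecting only the clothing view returns two items; adding the linen restriction narrows the result to the single item that satisfies both, which is the narrowing effect the text describes.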

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is, and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.


14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/

K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

[Graphic: The Finnish Museums on the Semantic Web, overview of the system’s set-up. User client: WWW browser (topic-based navigation, view-based filtering). Server: Ontogator software with an RDF database holding the semantic graph, i.e. the knowledge space of shared ontology and metadata. A Web crawler harvests the RDF instances from the museums. At the museums, collection databases 1 to n (relational schemas, DBMS) feed XML repositories (XML Schema: syntactic interoperability), from which a metadata editor produces RDF instances (RDF Schema: semantic interoperability).]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernherbehrendtsalzburgresearchat

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in a doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (httpwwwnladlibsoftcom), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bertnladlibsoftcom

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: NicolaGuarinoladsebpdcnrit

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meliedwit

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: PMillerukolnacuk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: FrankNackcwinl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: SRosshatiiartsglaacuk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottiangalileoimssfirenzeit

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntramgesersalzburgresearchat
Phone: +43-(0)662-2288-303

Mr John Pereira, johnpereirasalzburgresearchat
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, srosshatiiartsglaacuk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master a.o.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray (?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


POSITION PAPER

By Seamus Ross

Image: Genesis - The Creation. Division of Light and Darkness

Tim Berners-Lee and his colleagues at W3C have recognised that the real benefits of the web-based information revolution will come from enabling the interoperability of content. The current generation of web delivery is, they have argued, designed for human users, who struggle to make effective use of the billions of pages of information currently accessible. When we search for something at the moment we sometimes discover suitable candidate information, but more often than not we do not. More than this, the entire process of searching, discovery and use is designed to be driven by humans. When we discover one piece of the puzzle, we need manually to position that information so that it can help us to search out the next piece of the puzzle. We find that Darmstadt is near Frankfurt. Then we find that there are flights from Glasgow to Frankfurt and that there is a bus from Frankfurt Airport to Darmstadt. Then I search for timetables, make manual comparisons and decide which times best meet my requirements. In the Shangri-La that is the Semantic Web, my ‘agent’ would recognise from its regular review of my diary that I needed to be at a meeting in Darmstadt on the 21st of January 2003, and it would search out the options, analyse the timetables, identify the optimum travel arrangements, book my non-smoking hotel accommodation and order the taxi to take me to the airport. (It might even check the weather forecasts and warn me to bring particular types of clothing.) Certainly, to make this happen there has to be a fundamental shift in the way data, information and knowledge are represented on the web.

The proliferation of web-based resources makes finding what you are looking for increasingly difficult. According to Internet user studies, in 1996 50% of Internet users reported spending time looking for information without finding it, but by 2002 only about 40% of users ended their ‘searching sessions’ unsuccessfully. At first glance we might conclude that web discovery tools and/or the information searching skills of users have improved. Over the past seven years the quantity of content has mushroomed, the search tools have become more efficient, developers use meta-tags more effectively, and anecdotal evidence suggests that the searching techniques of users have become more sophisticated. We should continue to be surprised by the high failure rate and wonder why it remains proportionally so high as the number of users has grown to nearly 600 million. In reality there is just too much content available. It is poorly described. It is not interconnected. Search engines themselves are blunt instruments. Most users of the web do not have very mature searching strategies and rarely use even the blunt instruments as effectively as they might. A solution is to make more of the information capable of discovery, interpretation and reuse by automated information processing tools themselves. However, the current ways content is represented on the web make it nearly impossible for machines to search the web meaningfully and effectively. Even with the limitations of their skills and tools, humans are better at searching the web than the most powerful of the current generation of agents. The emergence of the Semantic Web would solve this problem.

The web has made us realise the tremendous potential of digital resources and made them widely available. Content as presented on the web currently is mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented, so that it is visible and discoverable within the context of the Semantic Web.

Image: Genesis - The Creation. Division of the Waters Above and Below the Firmament

The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.

TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES

Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about it meaningfully. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL [1], do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S) because it provides a flexible and extensible mechanism to represent metadata. A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML-based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response other languages such as DAML+OIL have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web. If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).

The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains, it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?

[1] See the ‘Semantic Web Terms and Reading List’ in this Issue.


ONTOLOGIES - THE JEWELS OF THE SEMANTIC WEB

For the Semantic Web to succeed it will require not only modelling languages such as XML, RDF and OWL, but also methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.

The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions, we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but that does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:

- Can we cost the creation of appropriate ontologies for the heritage sector?
- How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop and which ones will we be able to borrow from other sectors)?
- What heritage-based organisations should focus on ontology creation?
- Ontologies often fail to be interoperable. What solutions are there to this problem and how can they be made to work effectively?
- Does OWL (W3C’s Web Ontology Language, a semantic markup language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
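The multilingual gap described above can be illustrated with a minimal sketch. It is not OWL and does not use any ontology library; the concept names, the French labels and both helper functions are assumptions made purely for illustration. The idea is that ‘is a type of’ links are attached to language-neutral concepts, while each concept carries labels in several languages, so that ‘aquarelle’ and ‘watercolour’ resolve to the same node of the taxonomy.

```python
# Hypothetical mini-ontology: language-neutral concept identifiers,
# "is a type of" links between them, and labels per language.
# Every name below is an illustrative assumption, not a real vocabulary.
SUBCLASS_OF = {
    "watercolour": "painting",
    "necklace": "jewellery",
}

LABELS = {
    "watercolour": {"en": "watercolour", "fr": "aquarelle"},
    "painting": {"en": "painting", "fr": "peinture"},
    "necklace": {"en": "necklace", "fr": "collier"},
    "jewellery": {"en": "jewellery", "fr": "bijouterie"},
}


def concept_for(label):
    """Return the concept that carries this label in any language, or None."""
    wanted = label.lower()
    for concept, names in LABELS.items():
        if wanted in names.values():
            return concept
    return None


def is_type_of(narrow_label, broad_label):
    """True if the concept behind narrow_label is a subtype of the one behind broad_label."""
    narrow = concept_for(narrow_label)
    broad = concept_for(broad_label)
    if narrow is None or broad is None:
        return False
    # Walk up the taxonomy one "is a type of" link at a time.
    current = SUBCLASS_OF.get(narrow)
    while current is not None:
        if current == broad:
            return True
        current = SUBCLASS_OF.get(current)
    return False
```

With the labels separated from the taxonomy, `is_type_of("aquarelle", "painting")` and `is_type_of("watercolour", "peinture")` both hold, whereas a taxonomy keyed directly on English words would recognise neither. The cost, of course, is that someone has to supply and maintain the labels.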

Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.

Proof and trust is emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity, and with a lack of methods for testing its presence.

LEGITIMISING THE SEMANTIC WEB INVESTMENT

Heflin and Hendler (2001) make the valuable proposal that semantic markup should be seen as one aspect of webpage design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen we need a stronger argument for the benefits that such investment will bring to the heritage sector.

Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard, and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.

At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly heritage institutions found ways to take advantage of the opportunities offered by the web; there are still many small and medium-sized heritage institutions that have not.

Image: Genesis - The Creation. Division of Sea and Earth. Creation of Trees and Plants

Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that the access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to heritage sector institutions through better knowledge about, care of and access to their collections. In the ALM (archives, libraries and museums) sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited except, of course, at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have visible content, and too little of it is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
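To make the access-arrangements case concrete, here is a minimal sketch of what a tourism agent could do once opening hours are available as structured data rather than as free text on a web page. The data structure, the sample opening hours, the bus times and the function name are all assumptions invented for this illustration; no real institution, timetable or mark-up vocabulary is being described.

```python
from datetime import time

# Hypothetical structured description of one institution's opening hours.
OPENING_HOURS = {
    "tuesday": (time(10, 0), time(17, 0)),
    "wednesday": (time(10, 0), time(17, 0)),
    "saturday": (time(11, 0), time(16, 0)),
}

# Hypothetical bus arrival times at the nearest stop.
BUS_ARRIVALS = [time(9, 15), time(10, 45), time(16, 30), time(17, 20)]


def to_minutes(moment):
    """Convert a time of day to minutes after midnight."""
    return moment.hour * 60 + moment.minute


def usable_arrivals(day, arrivals, hours, min_visit_minutes=60):
    """Return the arrivals on `day` that still allow a visit of the given length."""
    if day not in hours:
        return []  # the institution is closed all day
    opens, closes = hours[day]
    usable = []
    for arrival in arrivals:
        # Someone arriving before opening has to wait, so the visit
        # starts at the opening time at the earliest.
        visit_start = max(to_minutes(arrival), to_minutes(opens))
        if to_minutes(closes) - visit_start >= min_visit_minutes:
            usable.append(arrival)
    return usable
```

For the sample data, `usable_arrivals("tuesday", BUS_ARRIVALS, OPENING_HOURS)` keeps the 09:15 and 10:45 buses and drops the two late ones, and any day missing from the table yields an empty list. The logic is trivial; the point is that it only becomes possible once every institution publishes the same small set of facts in the same machine-readable form.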

The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.

CONCLUSION

Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and placing in the public domain a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.

Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and upon a significant experiment that demonstrates its benefits to the wider community. For all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.

Bibliography

Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 117-131.

Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.

Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3(1), 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf

Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.

Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.

Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 448-453.

Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.

Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.

Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 120(10), 676-680.

Klein, M. 2001. XML, RDF and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.

McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.

Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf

Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.

Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA), ACM Publication.

Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.


DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL

AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS

By Joost van Kasteren

‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’

Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.

The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and to assure cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).

The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is being carried out on applying the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.

Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why) it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
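The interview does not say how the Dublin Core elements are assigned to the five Ws, so the following is only a sketch under an assumed grouping. The element names are those of the Dublin Core element set; the grouping, the sample records and the `search` function are hypothetical. It shows how one question-oriented query can run across records from different institutions, and also where the ‘fuzziness’ comes from: several distinct fields are folded into a single W.

```python
# Assumed grouping of Dublin Core elements under the five Ws.
# The interview does not give the actual mapping used by Cultuurwijzer.
FIVE_WS = {
    "who": ["creator", "contributor", "publisher"],
    "what": ["title", "subject", "description", "type"],
    "where": ["coverage"],
    "when": ["date"],
    "why": [],  # no obvious Dublin Core element; left empty in this sketch
}

# Hypothetical records from two different kinds of institution.
RECORDS = [
    {"identifier": "museum-001", "title": "View of Utrecht",
     "creator": "Unknown master", "type": "watercolour",
     "coverage": "Utrecht", "date": "1650"},
    {"identifier": "archive-042", "title": "Letters from Utrecht",
     "contributor": "City clerk", "coverage": "Utrecht", "date": "1702"},
]


def search(records, question, term):
    """Return identifiers of records whose fields for this W contain the term."""
    needle = term.lower()
    hits = []
    for record in records:
        for field in FIVE_WS[question]:
            if needle in record.get(field, "").lower():
                hits.append(record["identifier"])
                break  # one matching field is enough for this record
    return hits
```

Here `search(RECORDS, "where", "utrecht")` finds both the museum object and the archival item, while `search(RECORDS, "who", "clerk")` finds only the latter. A ‘who’ query cannot tell a creator from a publisher, which is exactly the loss of precision Kersen accepts in exchange for cross-institutional access.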

According to Kersen a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’

Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.

There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Metadata Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’

Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.

The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)

Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’

Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl


DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

Image: Genesis - The Creation. Birds and Fishes

It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.

‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction, it is like North. You go North but you never arrive and say “here it is”. This is the Semantic Web,’ said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed finally that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 (historians, language and information technology scientists, academics and publishers) were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article [1], Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.

The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is and the Darmstadt 13 were attractedBut they were not blinded Moderator Dr SeamusRoss the Director of Glasgow UniversityrsquosHumanities Advanced Technology and InformationInstitute (HATII) suggested that while the currentWWW content got a lot of lsquobangrsquo for its develop-ment dollars the Semantic Web needed hugeexpensive content before it could work well

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology [2] for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’

[1] T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

[2] Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.

Cultural Units of Learning - Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642) [3]. But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos [4] - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC [5] reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’

[3] The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it

[4] Telos Corporation, http://www.telos.com, Ashburn, VA, US

[5] CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), httpwwwwillpowerinfomybycoukcidocCIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A Hendler, co-author with Tim Berners-Lee in the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information, and from a NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols, HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also: John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001. http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see: S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe. An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
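As a rough illustration of the harvesting idea described in the OAI Protocol information box, the following Python sketch builds a ListRecords request and reads unqualified Dublin Core titles out of a response. It is a sketch only: the repository address, the record identifier and the title are invented for illustration, and no network request is made. Only the verb name, the oai_dc metadata prefix and the XML namespaces are taken from OAI-PMH 2.0 and Dublin Core.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH 2.0 responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL a service provider would request to harvest records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(response_xml):
    """Return (identifier, title) pairs found in a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Hypothetical, heavily shortened response from an invented museum repository.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:museum.example.org:obj-1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example object</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(list_records_url("http://museum.example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

A real harvester would also have to handle resumption tokens, error responses and schema validation, which this sketch leaves out.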

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy [6], the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk [7]. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame [8], and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms - topological relations, mereological [9] relations, dependence relations, these kinds of things - then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
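Mr Guarino’s point about stipulating the meaning of ‘before’ can be made concrete with a small Python sketch. It is not taken from any ontology discussed at the Forum: the two definitions, the class name and the year ranges are assumptions chosen purely for illustration. What it shows is that the same pair of periods can satisfy one stipulated definition of ‘before’ and fail another, which is why the choice has to be written down explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval given by its first and last year (inclusive)."""
    name: str
    start: int
    end: int

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("a period cannot end before it starts")


def before_strict(a, b):
    """Stipulation 1: 'before' excludes overlap; a must be over when b begins."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """Stipulation 2: 'before' permits overlap; a starts and ends earlier than b."""
    return a.start < b.start and a.end < b.end


# Year ranges are illustrative assumptions, not historical claims.
renaissance = Period("Renaissance", 1400, 1600)
other_period = Period("other period", 1550, 1700)

if __name__ == "__main__":
    print(before_strict(renaissance, other_period))            # False
    print(before_allowing_overlap(renaissance, other_period))  # True
```

Two systems that exchange the statement ‘the Renaissance comes before the other period’ would only agree on its truth if they shared the same stipulation.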

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

[6] Office of the e-Envoy, http://www.e-envoy.gov.uk

[7] Govtalk, http://www.govtalk.gov.uk

[8] Sesame environment, httpwwwontoknowledgeorgtoolsfactsheetSesamehtml

[9] Mereology n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also: M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage - architecture, film, whatever - all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing - what to do about time, basic concepts, process, agents and so on - were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) standard.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting, just yet, that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so, too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM is being developed into an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Genesis - The Creation, Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X

Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

Annotation & Authoring
‘How to annotate and have fun’, httpannotationsemanticweborgfaqfaq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’, by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002. http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html

OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000. www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC). http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK. http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing, and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001. http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001. http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML/; the XML Specifications can be found at http://www.w3.org/XML/Core/

‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a generic ontological concept is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML, C++ and all the other languages you can name, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

Developing well-founded generic ontologies may seem an enormous job, but the task is smaller than it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it


A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’. [1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’. [2] Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed so that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, and provides illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme: http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’ [3]

Yet given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer/).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/

Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface: a virtual table that results from an SQL query, which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item the set of rows is combined into a single XML document.
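The view-to-XML step described in this box can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the two tables, their columns and the in-memory SQLite database are hypothetical stand-ins for a museum database, each item yields exactly one row in the view, and the element names follow the fictitious http://www.m-i.org/images namespace used in this chapter.

```python
import sqlite3
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace used in this chapter


def build_database():
    """Create a tiny in-memory stand-in for a museum collection database."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE items (item_id TEXT PRIMARY KEY, type TEXT, subject TEXT,
                            iconclass TEXT, manuscript_id INTEGER);
        CREATE TABLE manuscripts (manuscript_id INTEGER PRIMARY KEY, title TEXT,
                                  creator TEXT, place TEXT, year TEXT);
        INSERT INTO manuscripts VALUES
            (1, 'Historic Bible', 'Alexander Master', 'Utrecht', 'circa 1430');
        INSERT INTO items VALUES
            ('image5kb78d38i', 'Column Miniature',
             'Eve emerges from Adam''s body', '71A3421', 1);
        -- The view joins both tables into one virtual table, one row per item.
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.subject, i.iconclass,
                   m.creator, m.title AS manuscript, m.place, m.year
            FROM items i JOIN manuscripts m USING (manuscript_id);
    """)
    return con


def rows_to_xml(con):
    """Return {item_id: XML bytes}, one XML document per row of the view."""
    ET.register_namespace("mi", MI_NS)
    con.row_factory = sqlite3.Row
    documents = {}
    for row in con.execute("SELECT * FROM item_view ORDER BY item_id"):
        # The item identifier becomes an attribute of the root element.
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": row["item_id"]})
        # Every other column of the view becomes one child element.
        for column in row.keys():
            if column != "item_id":
                ET.SubElement(root, f"{{{MI_NS}}}{column}").text = row[column]
        documents[row["item_id"]] = ET.tostring(root, encoding="ISO-8859-1")
    return documents


if __name__ == "__main__":
    for item_id, document in rows_to_xml(build_database()).items():
        print(item_id)
        print(document.decode("ISO-8859-1"))
```

Reading through a view rather than the base tables keeps the export code independent of how each museum happens to organise its tables; only the view definition has to change per database.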

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services: Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database-rows-to-XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image, 71A3421: Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Database rows (item data from database rows, grouped by item):

image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

Rows-to-XML process

XML document (image5kb78d38i.xml):

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names/). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all elements must have a closing tag; every start tag must match its end tag, and because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>) both must be written with the same case; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
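The well-formedness rules and the role of the namespace prefix can be tried out with any XML parser. The following is a minimal sketch using the parser in Python's standard library; the helper name is_well_formed is our own, and the document is a shortened version of the example in graphic 1.

```python
import xml.etree.ElementTree as ET

# A shortened version of the example document from graphic 1.
GOOD = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# The same document, but one end tag is written with a different case.
BAD = GOOD.replace(b"</mi:creator>", b"</mi:Creator>")


def is_well_formed(document: bytes) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(GOOD)
# The parser replaces the prefix "mi" by the namespace name it stands for.
print(root.tag)               # {http://www.m-i.org/images}image
print(root.get("image_id"))   # image5kb78d38i
print(is_well_formed(GOOD), is_well_formed(BAD))  # True False
```

The first printed line shows that the prefix itself is only a placeholder: what the parser keeps is the namespace URI combined with the local element name.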

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. The museums therefore validate the XML documents they create against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)   <xs:element name="image">
(4)     <xs:complexType>
(5)       <xs:sequence>
(6)         <xs:element name="type" type="xs:string"/>
(7)         <xs:element name="subject" type="xs:string"/>
(8)         <xs:element name="iconclass" type="xs:string"/>
(9)         <xs:element name="creator" type="xs:string"/>
(10)        <xs:element name="manuscript" type="xs:string"/>
(11)        <xs:element name="place" type="xs:string"/>
(12)        <xs:element name="year" type="xs:string"/>
(13)      </xs:sequence>
(14)      <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)    </xs:complexType>
(16)  </xs:element>
(17) </xs:schema>

Short explanations:

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.

(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.

(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.

(2c) Is our default namespace.

(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.

(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.

(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.

(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
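To make concrete what the schema above demands of a document, the following sketch re-states its main constraints as explicit checks. It is not an XML Schema validator (Python's standard library does not include one, and a real project would use a schema-aware processor); the function name check_image_document and the wording of the messages are our own.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# The ordered sequence declared in lines (6)-(12) of the schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

VALID = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""


def check_image_document(document: bytes) -> list:
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(document)
    problems = []
    # Lines (2b), (2d) and (3): a namespace-qualified root element "image".
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not image in the m-i.org namespace")
    # Line (14): the attribute image_id is required.
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    # Lines (5)-(13): the child elements must appear in the declared order.
    found = [child.tag for child in root]
    wanted = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != wanted:
        problems.append("child elements do not match the declared sequence")
    # Lines (6)-(12): xs:string elements must not contain other elements.
    if any(len(child) for child in root):
        problems.append("a simple-typed element contains child elements")
    return problems


print(check_image_document(VALID))  # []
```

A document that passes these checks is not thereby proven schema-valid, but a document that fails any of them would certainly be rejected by a validator using the schema above.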

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge. [4]

Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge. [5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies, [6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate. [7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421: Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture. [9]

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’ [10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM). [11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably. [12]


Hierarchical path of the Iconclass classification 71A3421:

7 Bible
  71 Old Testament
    71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
      71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
        71A34 creation of Eve
          71A342 Eve is fashioned from Adam’s rib
            71A3421 Eve emerges from Adam’s body
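Because each level of the path above adds one character to the classification code, the path can be derived from the code itself. The sketch below illustrates this for the example code only; it assumes that every additional character marks one level, which holds for 71A3421 but ignores any further notational features of the full Iconclass system. The labels are copied from the path shown above.

```python
def iconclass_path(notation: str) -> list:
    """Return every prefix of a plain Iconclass code, broadest first."""
    return [notation[:length] for length in range(1, len(notation) + 1)]


# Textual correlates for the example path, copied from the listing above.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

for depth, code in enumerate(iconclass_path("71A3421")):
    # Indent each level to make the hierarchy visible.
    print("  " * depth + code, LABELS[code])
```

This is what makes a hierarchical classification useful for retrieval: a search for the broader concept 71A34 (creation of Eve) can find the image simply by comparing the beginning of the code.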

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/businessmsstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’ the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core, RDF is very abstract, very dry, and very academic.’ Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable statements about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

- resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources
- properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
- statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Graphic 2: RDF Data Model

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
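The two triples of Graphic 2 can also be written down directly as data. The following is a minimal Python sketch and not part of any system described here: the `Triple` type and the `objects` helper are illustrative names, and the `#` separator appended to the m-i.org namespace is an assumption.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # assumed fragment separator


class Triple(NamedTuple):
    """One RDF statement: subject and predicate are URIs, obj is a URI or literal."""
    subject: str
    predicate: str
    obj: str


# The labelled directed graph is simply a set of triples.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow every arc labelled `predicate` that leaves the node `subject`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


direct_parents = objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf")
```

Because a triple is only a binary relation, anything more complex has to be expressed as several triples that share a node.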

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case, a tool developed by the project team called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
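How a mapping rule is instantiated can be sketched in a few lines of Python using the limited XPath support of the standard library. This is an illustration only, not the Meedio tool: the XML record, its element names and the `apply_rule` function are hypothetical, chosen so that the result mirrors the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML metadata record; the real FMS documents may look different.
XML_DOC = """<image id="5kb78d38i">
  <type>Image Miniature</type>
  <creator>Alexander Master</creator>
</image>"""

# A mapping rule as a triple template: XPath for the subject value,
# a fixed RDF property, and XPath for the object value.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(root, rule):
    """Instantiate the template; return no triples if the rule does not match."""
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return set()
    return {(subject.strip(), predicate, obj.strip())}


root = ET.fromstring(XML_DOC)
triples = apply_rule(root, RULE)
```

A rule whose XPath expressions find nothing yields an empty set, which corresponds to the case where the rule does not match the document.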

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same term refers to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
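The difference between the two cases can be made concrete with a small Python sketch. It is a simplified illustration, not the FMS implementation: the class names, the synonym sets and the functions `candidate_classes` and `resolve` are all invented for the example.

```python
# Hypothetical synonym sets attached to ontology classes.
SYNONYM_SETS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:DecoratedInitial": {"decorated initial", "initial"},
    "mi:Monogram": {"monogram", "initial"},  # "initial" is polysemous here
}


def candidate_classes(term):
    """Return every class whose synonym set contains the term."""
    key = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if key in terms)


def resolve(term):
    """Map a term to a class automatically only if it is unambiguous.

    Returns None for unknown or polysemous terms; in those cases the
    cataloguer has to choose the interpretation by hand.
    """
    candidates = candidate_classes(term)
    return candidates[0] if len(candidates) == 1 else None
```

Synonyms resolve automatically because several terms point to one class; a polysemous term points to several classes, so no automatic choice is possible.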

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
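The implicit subclass relationship follows mechanically from the two explicit statements. A minimal Python sketch of that inference, with the class hierarchy held in a plain dictionary (the names `SUBCLASS_OF` and `superclasses` are illustrative):

```python
# Direct rdfs:subClassOf statements, keyed by subclass.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, direct=SUBCLASS_OF):
    """Collect all direct and implied superclasses by following the arcs upward."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in direct.get(current, set()):
            if parent not in found:  # also guards against cycles
                found.add(parent)
                stack.append(parent)
    return found


implied = superclasses("mi:ColumnMiniature")
```

Only the two direct statements are stored; the relationship between mi:ColumnMiniature and mi:Image is derived when it is needed.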

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, that mi:hasType is a property, and that RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
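Read as constraints, as in this primer, domain and range declarations allow statements to be checked against the schema. The Python sketch below illustrates that reading under simplifying assumptions: the instance identifiers (`ex:...`) and the `violations` function are hypothetical, each instance has exactly one class, and subclass relationships are ignored. Note that in the W3C semantics these declarations license inferences about types rather than reject data, so a checker of this kind is an application-level choice.

```python
# Schema declarations taken from the statements above.
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical instance data: rdf:type of each resource.
TYPES = {
    "ex:image42": "mi:ColumnMiniature",
    "ex:image43": "mi:ColumnMiniature",
    "ex:initial7": "mi:DecoratedInitial",
}


def violations(triple):
    """Return which constraints ('domain', 'range') a statement breaks."""
    subject, predicate, obj = triple
    problems = []
    if predicate in DOMAIN and TYPES.get(subject) != DOMAIN[predicate]:
        problems.append("domain")
    if predicate in RANGE and TYPES.get(obj) != RANGE[predicate]:
        problems.append("range")
    return problems
```

A statement with an empty list of violations would pass validation; any other result would be reported back to the cataloguer.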

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:

| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.’13

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com, searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
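Because an RDF card is a set of triples, combining harvested cards amounts to a set union. The following Python sketch shows the idea with invented data; the card contents, the `a:` and `b:` prefixes and the `build_repository` function are assumptions for illustration, and the actual harvesting over the Web is left out.

```python
# Shared ontology statements (schema level).
ONTOLOGY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}

# Hypothetical RDF cards, as they might be harvested from two museums.
CARD_MUSEUM_A = {
    ("a:item1", "rdf:type", "mi:ColumnMiniature"),
    ("a:item1", "mi:hasCreator", "Alexander Master"),
}
CARD_MUSEUM_B = {
    ("b:item9", "rdf:type", "mi:Miniature"),
}


def build_repository(ontology, cards):
    """Merge the ontology and all harvested cards into one graph."""
    repository = set(ontology)
    for card in cards:
        repository |= card  # set union: duplicate statements collapse
    return repository


repository = build_repository(ONTOLOGY, [CARD_MUSEUM_A, CARD_MUSEUM_B])
```

Harvesting the same card twice leaves the repository unchanged, which is one reason a set of statements is a convenient unit of exchange.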

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
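The principle of view-based filtering can be sketched as follows. This is not how Ontogator is implemented, which the text does not describe; it is a minimal Python illustration with a hypothetical single-parent class hierarchy and invented instances.

```python
# Hypothetical class hierarchy (each class has at most one parent here).
SUBCLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
    "mi:DecoratedInitial": "mi:Image",
}

# Hypothetical instances and their classes.
INSTANCE_TYPE = {
    "ex:img1": "mi:ColumnMiniature",
    "ex:img2": "mi:DecoratedInitial",
    "ex:img3": "mi:Miniature",
}


def is_a(cls, target):
    """True if cls equals target or is a direct or implied subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_view(selected_class):
    """Return the instances that fall under the selected class."""
    return sorted(i for i, c in INSTANCE_TYPE.items() if is_a(c, selected_class))
```

Selecting mi:Image returns all three instances, mi:Miniature narrows the result to two, and mi:ColumnMiniature to one, which is the narrowing effect of constraining a view further.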

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14

This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems:15

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’

| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).’

| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network;

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge);

| Learning: an agent will improve its performance over time.


14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


The Finnish Museums on the Semantic Web: Overview of the System’s Set-up

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

[Diagram components: User client (WWW browser; topic-based navigation, view-based filtering); Server (Ontogator software); RDF database (semantic graph: knowledge space of shared ontology and metadata); Web crawler; RDF instances (RDF Schema: semantic interoperability); Metadata editor and XML repositories (XML Schema: syntactic interoperability); Collection databases 1, 2 ... n (relational schemas, DBMS)]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses/
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th 2002 in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd 2002 in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


Genesis - The Creation: Division of the Waters Above and Below the Firmament

mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented so that it is visible and discoverable within the context of the Semantic Web.

The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.
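The visit-planning scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the opening hours, the bus arrival times and the function name `feasible_arrivals` are all hypothetical, and a real agent would read such facts from semantically marked-up sources rather than from hand-written dictionaries.

```python
from datetime import time

# Hypothetical facts, standing in for semantically marked-up data
# published by a museum and by a transport operator.
OPENING_HOURS = {
    "tuesday": (time(10, 0), time(17, 0)),
    "wednesday": (time(10, 0), time(21, 0)),
}
BUS_ARRIVALS = {
    "tuesday": [time(9, 15), time(12, 40), time(16, 50)],
    "wednesday": [time(18, 30)],
}


def minutes(t):
    """Convert a time of day to minutes since midnight."""
    return t.hour * 60 + t.minute


def feasible_arrivals(day, min_visit_minutes=60):
    """Return the bus arrivals that still leave enough visiting time."""
    if day not in OPENING_HOURS:
        return []  # the institution is closed (or unknown) on that day
    opens, closes = OPENING_HOURS[day]
    result = []
    for arrival in BUS_ARRIVALS.get(day, []):
        # A visitor arriving early has to wait until the doors open.
        start = max(arrival, opens)
        if minutes(closes) - minutes(start) >= min_visit_minutes:
            result.append(arrival)
    return result


print(feasible_arrivals("tuesday"))
```

The point of the sketch is that the logic is trivial once both sources are machine-readable; the hard part, as the rest of this paper argues, is getting institutions to publish the data in a consistent form.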

TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES

Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about it in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL¹, do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S) because it provides a flexible and extensible mechanism to represent metadata.

A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML-based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response, other languages such as DAML+OIL have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web. If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).

The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains, it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?
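The gap between element names and meaning can be illustrated with a small Python sketch. Everything in it is an assumption made for the example: the two record fragments, the element names `maker` and `artist`, and the shared term `creator` are invented, and the mapping table merely stands in for the agreement that an ontology would have to supply.

```python
import xml.etree.ElementTree as ET

# Two hypothetical records from different institutions. Both are
# well-formed XML, but they name the same idea differently.
RECORD_A = "<object><maker>Rembrandt</maker></object>"
RECORD_B = "<item><artist>Rembrandt</artist></item>"

# Assumed, hand-made agreement on meaning; XML itself does not provide it.
SHARED_MEANING = {"maker": "creator", "artist": "creator"}


def extract(xml_text, mapping=None):
    """Read the child elements of a record into a dictionary.

    Without a mapping the keys are just the element names; with one,
    element names are translated into shared terms where known.
    """
    root = ET.fromstring(xml_text)
    fields = {}
    for child in root:
        key = mapping.get(child.tag, child.tag) if mapping else child.tag
        fields[key] = child.text
    return fields


# Parsed as plain XML, the two records look unrelated to a program.
print(extract(RECORD_A), extract(RECORD_B))
# With the agreed mapping, they turn out to say the same thing.
print(extract(RECORD_A, SHARED_MEANING) == extract(RECORD_B, SHARED_MEANING))
```

A DTD or XML Schema could validate either record, yet neither would tell an agent that `maker` and `artist` are interchangeable; that knowledge lives only in the mapping.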

¹ See the ‘Semantic Web Terms and Reading List’ in this Issue.


ONTOLOGIES - THE JEWELS OF THE SEMANTIC WEB

For the Semantic Web to succeed it will require not only modelling languages such as XML, RDF and OWL, but also methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.

The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections and some simple rules of inference and logic, for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but it does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:

- Can we cost the creation of appropriate ontologies for the heritage sector?
- How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop, and which ones will we be able to borrow from other sectors)?
- What heritage-based organisations should focus on ontology creation?
- Ontologies often fail to be interoperable. What solutions are there to this problem, and how can they be made to work effectively?
- Does OWL (W3C’s Web Ontology Language, a semantic markup language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
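The watercolour and aquarelle example above can be made concrete with a small Python sketch. It is not OWL and makes no claim about how any ontology language works internally; it only assumes a toy taxonomy held in a dictionary, a hypothetical table of French labels, and an invented helper named `is_a`.

```python
# The two 'is a type of' relationships named in the text.
SUBCLASS_OF = {
    "watercolour": "painting",
    "necklace": "jewellery",
}

# Hypothetical French labels pointing at the English terms above.
FRENCH_LABELS = {
    "aquarelle": "watercolour",
    "peinture": "painting",
}


def is_a(term, ancestor, labels=None):
    """Follow 'is a type of' links upwards from term towards ancestor.

    If a label table is supplied, both words are first translated into
    the vocabulary used by the taxonomy.
    """
    labels = labels or {}
    current = labels.get(term, term)
    target = labels.get(ancestor, ancestor)
    seen = set()  # guards against accidental cycles in the taxonomy
    while current is not None and current not in seen:
        if current == target:
            return True
        seen.add(current)
        current = SUBCLASS_OF.get(current)
    return False


print(is_a("watercolour", "painting"))               # known to the taxonomy
print(is_a("aquarelle", "painting"))                 # unknown word, no labels
print(is_a("aquarelle", "painting", FRENCH_LABELS))  # labels bridge the gap
```

The taxonomy itself never changes between the second and third call; only the label table does. That is the sense in which multilingual capability is an extra layer of work on top of ontology creation, and part of what the costing question in the list above would have to cover.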

Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.

Proof and trust is emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity, and with a lack of methods for testing its presence.

LEGITIMISING THE SEMANTIC WEB INVESTMENT

Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of web-page design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For such investment to happen we need a stronger argument for the benefits that it will bring to the heritage sector.

Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard, and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.

At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly heritage institutions found ways to take advantage of the opportunities offered by the web, although there are still many small and medium-sized heritage institutions that have not.

Genesis - The Creation: Division of Sea and Earth; Creation of Trees and Plants

Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to heritage sector institutions through better knowledge about, care of and access to their collections. In the ALM (archives, libraries and museums) sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited, except of course at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of that description work online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have visible content, and too little of that content is actually usable.

If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.

The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.


CONCLUSION

Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value; (b) an accessible and narrow ontology; and (c) a personalisable tool for processing knowledge.

Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and upon a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.

Bibliography

Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 117-131.

Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.

Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3(1), 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf

Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.

Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.

Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 448-453.

Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.

Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.

Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 120(10), 676-680.

Klein, M. 2001. XML, RDF and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.

McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.

Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf

Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.

Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA), ACM Publication.

Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.


DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL

AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS

By Joost van Kasteren

‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach whereby institutions have to squeeze themselves into a certain format is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’

Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.

The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and (qualified) Dublin Core.

The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is being carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.

Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
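How a mapping from Dublin Core to the 5 Ws might support cross-database search can be sketched in Python. The interview does not say which elements DEN assigns to which W, so the assignment in `FIVE_WS` is purely an assumption, as are the function names; the two sample records borrow titles from the image credits in this Issue but are otherwise invented for illustration.

```python
# Assumed assignment of Dublin Core elements to the 5 Ws.
# The element names are real Dublin Core; the grouping is a guess.
FIVE_WS = {
    "who": ["creator", "contributor", "publisher"],
    "where": ["coverage"],
    "what": ["title", "subject", "type"],
    "when": ["date"],
    "why": ["description"],
}

# Hypothetical records, as two different institutions might expose them.
RECORDS = [
    {"title": "Der Naturen Bloeme", "creator": "Jacob van Maerlant",
     "date": "c. 1350", "coverage": "Flanders"},
    {"title": "Liber Floridus", "creator": "Lambert of St Omer",
     "date": "1460", "coverage": "Lille and Ninove"},
]


def group_by_w(record):
    """Sort the values of one Dublin Core record under the 5 Ws."""
    grouped = {w: [] for w in FIVE_WS}
    for w, elements in FIVE_WS.items():
        for element in elements:
            if element in record:
                grouped[w].append(record[element])
    return grouped


def search(records, w, term):
    """Return the records whose values under one W contain the term."""
    term = term.lower()
    return [
        record for record in records
        if any(term in value.lower() for value in group_by_w(record)[w])
    ]


print(search(RECORDS, "who", "maerlant"))
```

The ‘fuzziness’ Kersen mentions shows up directly: a search under ‘where’ treats ‘Lille and Ninove’ as one opaque string, which is far less precise than a domain-specific place authority would be.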

According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’

Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.

There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification and the Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’

Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.

The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)

Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’

Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl


DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.

‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction; it is like North. You go North, but you never arrive and say “here it is”. This is the Semantic Web,’ said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed finally that the cultural heritage sector needed the Semantic Web, and that a good deal of education and guidance would be required to make it appreciate that need.

Genesis - The Creation: Birds and Fishes

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing perhaps XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.

The dazzling forecast of Berners-Lee et al. is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’

¹ T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001. http://www.sciam.com/2001/0501issue/0501berners-lee.html

² Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.

Cultural Units of Learning - Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

The Dutch had doubts too Dr Janneke van Kersenis an art historian with Digital Heritage Netherlands(Digitaal Erfgoed Nederland httpwwwdennl)where an XML-based content management systemis being combined with a Resource DescriptionFramework (RDF) to join databases from severalcultural heritage institutions She told the expertslsquoI need to be assured that we will be able to build alayered structure that is equally applicable to eachknowledge domain Furthermore I think that thecultural heritage sector is too much of a nichemarket to develop thisrsquo

Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely, and these shifts were invisible to a system. ‘Humans can look at material one day, and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance, if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’

3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US
5 CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), httpwwwwillpowerinfomybycoukcidocCIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved


Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting: the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also: John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see: S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php

by projects like the Web Services of the Open Archives Initiative (OAI).
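As a rough illustration of how a service provider might address a repository that supports the OAI Protocol, the sketch below builds request URLs for the ListRecords and GetRecord requests named in the information box, asking for unqualified Dublin Core (`oai_dc`). The repository address, the record identifier and the helper name `oai_request_url` are assumptions made up for this example; no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical repository endpoint, used for illustration only.
BASE_URL = "http://repository.example.org/oai"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


# Ask the repository for all records as unqualified Dublin Core.
list_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Ask for a single record; the identifier is a made-up example.
get_url = oai_request_url(
    BASE_URL,
    "GetRecord",
    identifier="oai:repository.example.org:object-1",
    metadataPrefix="oai_dc",
)

print(list_url)
print(get_url)
```

The response to either request would be an XML document that a harvester can validate against the schema before indexing it.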

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s


Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something something dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard MPEG-7 and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising, as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here: that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
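The point about stipulating the meaning of ‘before’ can be made concrete with a small sketch. It is only an illustration of the idea, not Mr Guarino’s formalism: the two function names, the `Period` class and the year boundaries are assumptions invented for this example, and the years are not offered as historical claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period
    end: int    # last year of the period


def before_strict(a, b):
    """Stipulation 1: a is 'before' b only if a has ended when b starts."""
    return a.end < b.start


def before_overlapping(a, b):
    """Stipulation 2: a is 'before' b if it starts and ends earlier;
    the two periods are allowed to overlap."""
    return a.start < b.start and a.end < b.end


# Illustrative boundaries only (assumed for the example).
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

print(before_strict(renaissance, baroque))       # overlap excluded
print(before_overlapping(renaissance, baroque))  # overlap permitted
```

Two systems that both record ‘Renaissance before Baroque’ would disagree on these data unless they share the same stipulation, which is the interoperability problem the axioms are meant to remove.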

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at httpwwwontoknowledgeorgpubshtml
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

6 Office of the e-Envoy, http://www.e-envoy.gov.uk
7 Govtalk, http://www.govtalk.gov.uk
8 Sesame environment, httpwwwontoknowledgeorgtoolsfactsheetSesamehtml
9 Mereology n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

development could begin, basic classifications and methodologies would be required to form a foundation for the work.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’


MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also: M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage, architecture, film, whatever, all working with very different substances.’

Seamus Ross added: ‘So, we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The


model is under review for adoption as an International Organization for Standardization (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So, this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was, “How can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So, where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so, too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group, which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Annotation & Authoring
‘How to annotate and have fun’, httpannotationsemanticweborgfaqfaq2, and other papers. See also: http://annotation.semanticweb.org/tools. Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations, www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in: Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


Genesis – The Creation, Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

In the Summary, we have provided information boxes on projects but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.


OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University: The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group) as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
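To give an impression of what ‘based on XML’ means in practice, the following sketch assembles a minimal SOAP 1.1 request envelope with Python’s standard library. The SOAP envelope namespace is the published one; the service namespace, the operation name `GetObjectRecord` and the `objectId` parameter are hypothetical and stand in for whatever a real museum service would define in its WSDL description.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace, for illustration only.
SERVICE_NS = "http://museum.example.org/catalogue"

ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("cat", SERVICE_NS)


def build_request(object_id):
    """Return a SOAP envelope asking for one catalogue record."""
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    # The operation and its parameter are assumed names.
    operation = ET.SubElement(body, "{%s}GetObjectRecord" % SERVICE_NS)
    ident = ET.SubElement(operation, "{%s}objectId" % SERVICE_NS)
    ident.text = object_id
    return ET.tostring(envelope, encoding="unicode")


message = build_request("inv-1234")
print(message)
```

In a real exchange this message would be posted over HTTP to the service, and the reply would arrive in an envelope of the same shape.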

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML – eXtensible Mark-up Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991, he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory, the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words, real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a generic ontology is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
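The violist example can be played through in a few lines of code. The sketch below is an assumption-laden illustration rather than a formal mereology: it invents two relation names, `COMPONENT_OF` and `MEMBER_OF`, and a simple closure function to show what happens when both are lumped together under a single ‘part’ and what happens when they are kept apart.

```python
# Two relations that everyday language both calls "part"
# (names and facts are assumptions made for this illustration).
COMPONENT_OF = {("finger", "violist")}   # physical part of a whole
MEMBER_OF = {("violist", "orchestra")}   # role within a group


def transitive_closure(pairs):
    """Add (a, c) whenever (a, b) and (b, c) are present."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Ambiguous model: a single transitive "part" relation.
naive_part = transitive_closure(COMPONENT_OF | MEMBER_OF)

# Disambiguated model: only component-of is treated as transitive,
# and it is never chained with membership.
component_closure = transitive_closure(COMPONENT_OF)

print(("finger", "orchestra") in naive_part)
print(("finger", "orchestra") in component_closure)
```

With the single relation, the system concludes that the finger is part of the orchestra; with the two relations kept distinct, it does not.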

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback thatcomputer science curricula scarcely ever contain anintroduction to ontological foundations of conceptualmodelling lsquoStudents learn all about Java HTML andC++ and name all the other languages and they alsolearn how to use these But when they graduate theyhardly know a thing about formal ontology I reallythink people should know more about the work onontology that has been done in philosophy It iscertainly not much harder to acquire than saystudying differential equations or learning howto use Javarsquo

It seems as if it is an enormous job to developwell-founded generic ontologies but it is not asenormous a task as it appears Guarino lsquoI would saythat a few dozen would get you on the way nicelyBut you have to take the fundamental routeAt themoment development of the Semantic Web is drivenby the need for short-term results Henceinteroperability is realised by putting the right tagson the informationThat is not what I call semanticsthat is syntax XML and RDF are very useful for thisbut they fall short when you want to create a realSemantic WebrsquoLaboratory of Applied Ontology httpontologyiprmcnrit

DigiCULT 25

SEMANTIC WEB SHOULD BE BASED

ON WELL-FOUNDED ONTOLOGIESAN INTERVIEW WITH NICOLA GUARINOLABORATORY OF APPLIED ONTOLOGY ITALY

By Joost van Kasteren

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER

By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples. (Cf. http://www.w3.org/TR/rdf-primer/)

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/

Syntactic Transformation 1: Creating the XML Documents

In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface: a virtual table that results from an SQL query, which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
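The grouping step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FMS implementation: an in-memory SQLite table stands in for the museum's database view, and the helper name `rows_to_xml` is hypothetical. Only the column names, the namespace and the sample values are taken from graphic 1.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

NS = "http://www.m-i.org/images"  # namespace used in graphic 1
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def rows_to_xml(rows):
    """Combine all view rows belonging to one item into a single XML element."""
    rows = list(rows)
    root = ET.Element(f"{{{NS}}}image", {"image_id": rows[0]["item_id"]})
    for field in FIELDS:
        # A join can repeat a value over several rows; keep each value once.
        for value in dict.fromkeys(row[field] for row in rows):
            ET.SubElement(root, f"{{{NS}}}{field}").text = value
    return root


# Assumption: an in-memory table stands in for the museum's database view.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE item_view "
    "(item_id, type, subject, iconclass, creator, manuscript, place, year)"
)
conn.execute(
    "INSERT INTO item_view VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ("image5kb78d38i", "Column Miniature", "Eve emerges from Adam's body",
     "71A3421", "Alexander Master", "Historic Bible", "Utrecht", "circa 1430"),
)

ET.register_namespace("mi", NS)
cursor = conn.execute("SELECT * FROM item_view ORDER BY item_id")
# Rows arrive sorted by item, so consecutive rows with one item_id form a group.
documents = {
    item_id: rows_to_xml(group)
    for item_id, group in groupby(cursor, key=lambda row: row["item_id"])
}
xml_text = ET.tostring(documents["image5kb78d38i"], encoding="unicode")
print(xml_text)
```

Sorting by `item_id` before grouping matters here, because `groupby` only merges neighbouring rows.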

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents: A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image, 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document must begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.

Graphic 1: Database rows to XML process

Database rows, grouped by item:

image5kb78d38i, 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

XML document (item data from database rows), image5kb78d38i.xml:

(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
         image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace, declared in the start tag of the root element, is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi: (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
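The syntax rules above can be tried out with any XML parser. The following sketch uses the parser in Python's standard library and a shortened version of the document from graphic 1; the helper name `is_well_formed` and the three broken variants are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

WELL_FORMED = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">\n'
    "  <mi:creator>Alexander Master</mi:creator>\n"
    "</mi:image>"
)


def is_well_formed(document):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(document.encode("iso-8859-1"))
    except ET.ParseError:
        return False
    return True


# Each variant breaks exactly one of the syntax rules listed above.
variants = {
    "original": WELL_FORMED,
    "end tag in different case": WELL_FORMED.replace("</mi:creator>", "</mi:Creator>"),
    "attribute value without quotes": WELL_FORMED.replace('"image5kb78d38i"', "image5kb78d38i"),
    "missing end tag": WELL_FORMED.replace("</mi:creator>", ""),
}
for label, text in variants.items():
    print(f"{label}: {is_well_formed(text)}")

# The parser replaces the prefix mi: by the namespace name it stands for.
root = ET.fromstring(WELL_FORMED.encode("iso-8859-1"))
print(root.tag)
```

Note that the parser reports the root element under its full namespace name, which is why the prefix itself can be chosen freely as long as it is declared.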

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document,
| which elements are child elements, as well as their order and number,
| whether an element is empty or can include text,
| the data types for elements and attributes,
| as well as default and fixed values for elements and attributes.

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)    <xs:element name="image">
(4)      <xs:complexType>
(5)        <xs:sequence>
(6)          <xs:element name="type" type="xs:string"/>
(7)          <xs:element name="subject" type="xs:string"/>
(8)          <xs:element name="iconclass" type="xs:string"/>
(9)          <xs:element name="creator" type="xs:string"/>
(10)         <xs:element name="manuscript" type="xs:string"/>
(11)         <xs:element name="place" type="xs:string"/>
(12)         <xs:element name="year" type="xs:string"/>
(13)       </xs:sequence>
(14)       <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)     </xs:complexType>
(16)   </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.

(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.

(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.

(2c) Is our default namespace.

(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.

(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.

(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.

(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
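To make the constraints in the schema above tangible, the following sketch checks them by hand in Python. It is deliberately not an XML Schema validator (the standard library does not include one); the function `check_image_document` and the helper `make_document` are hypothetical names, and the checks only mirror what lines (3) to (14) of the schema state: the root element, the required attribute, and the ordered sequence of text-only child elements.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"
# Order taken from the xs:sequence in lines (6)-(12) of the schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image_document(xml_text):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{NS}}}image":
        problems.append("root element must be 'image' in the m-i.org namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute 'image_id' is missing")
    found = [child.tag for child in root]
    expected = [f"{{{NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != expected:
        problems.append("child elements must appear exactly once, in schema order")
    for child in root:
        if len(child) > 0:
            problems.append(f"{child.tag} must contain text only")
    return problems


def make_document(child_names, with_id=True):
    """Build a small test document; every child simply contains the text 'x'."""
    attribute = ' image_id="image5kb78d38i"' if with_id else ""
    children = "".join(f"<mi:{name}>x</mi:{name}>" for name in child_names)
    return f'<mi:image xmlns:mi="{NS}"{attribute}>{children}</mi:image>'


print(check_image_document(make_document(EXPECTED_CHILDREN)))
print(check_image_document(make_document(list(reversed(EXPECTED_CHILDREN)))))
print(check_image_document(make_document(EXPECTED_CHILDREN, with_id=False)))
```

In practice one would hand the schema and the document to a validating parser rather than re-implement the rules; the sketch only shows what kind of errors such a validator reports.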

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
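The idea of multi-channel publishing can be illustrated with a small sketch. Style sheets would normally be written in a style sheet language such as XSLT or CSS; here two plain Python functions stand in for two style sheets, which is an assumption made only to keep the example self-contained. The record is a shortened version of the document in graphic 1, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = {"mi": "http://www.m-i.org/images"}
RECORD = ET.fromstring(
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:year>circa 1430</mi:year>"
    "</mi:image>"
)


def field(record, name):
    """Read the text of one child element, or '' if it is absent."""
    return record.findtext(f"mi:{name}", default="", namespaces=NS)


def to_html(record):
    """'Style sheet' 1: a fragment for a Web page."""
    return (
        f"<h1>{escape(field(record, 'subject'), quote=False)}</h1>"
        f"<p>{escape(field(record, 'creator'), quote=False)}, "
        f"{escape(field(record, 'year'), quote=False)}</p>"
    )


def to_plain_text(record):
    """'Style sheet' 2: a one-line caption, e.g. for a printed list."""
    return f"{field(record, 'subject')} ({field(record, 'creator')}, {field(record, 'year')})"


print(to_html(RECORD))
print(to_plain_text(RECORD))
```

The data are stored once; only the presentation rules differ between the two outputs.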

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.⁴

Ontology

In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.⁵

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,⁶ and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.⁷

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place⁸ that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.⁹

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’¹⁰

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).¹¹ This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.¹²


Hierarchical path of the Iconclass classification 71A3421:

7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
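In the hierarchical path listed above, every broader concept is identified by a leading part of the code 71A3421. A minimal sketch of how a program could derive this path is given below. It assumes a plain notation such as the one in our example, where each additional character is one step down the hierarchy; other forms of Iconclass notation are not covered, and the function name `iconclass_path` is hypothetical. The labels are the ones listed above.

```python
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def iconclass_path(notation):
    """Return the notation's ancestors, broadest first, then the notation itself.

    Assumption: each additional character is one step down the hierarchy,
    which holds for the plain code used in this example.
    """
    return [notation[:length] for length in range(1, len(notation) + 1)]


for code in iconclass_path("71A3421"):
    print(code, LABELS.get(code, "(label not listed)"))
```

A search for the broader concept 71A34 (creation of Eve) can thus retrieve our image simply by testing whether the image's code starts with 71A34.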

6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser/. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’ the XML metadata elements are mapped to the RDF classes and properties, which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.’ Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, the statements that carry this information are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources;

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;

| statements: these associate a value for a named property with the resource.

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled, directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
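The triple structure can be sketched without any RDF software. The example below stores the two subclass statements of graphic 2 as plain Python tuples and queries them. The class `Triple` and the helper `objects` are hypothetical names, and writing our class URIs with a `#` before the class name (as the RDF Schema namespace does) is an assumption.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # assumed form of our namespace


class Triple(NamedTuple):
    """One RDF statement: subject and predicate are URIs, obj is a URI here."""
    subject: str
    predicate: str
    obj: str


# The two statements shown in graphic 2.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow all arcs with the given label that leave the given node."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


print(objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf"))

# The same URI can be a node in one statement and could label an arc in another:
nodes = {t.subject for t in graph} | {t.obj for t in graph}
print(MI + "Miniature" in nodes)
```

Literal values (the boxes in the graph) are left out of this sketch; an RDF library would represent them with a separate type so that they cannot be confused with URIs.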

Graphic 2: RDF Data Model

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule <image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Image Miniature, mi:hasCreator, ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
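How a mapping rule is instantiated can be sketched as follows. This is an illustration under stated assumptions, not the rule format of the FMS tool: a rule is written as a three-part tuple, parts beginning with `./` are treated as path expressions (using the small XPath subset supported by Python's standard library), and the function name `apply_rule` is hypothetical. The sketch substitutes the raw element values only; mapping a value such as "Column Miniature" to an ontology class is a separate step that is not shown.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}
DOCUMENT = ET.fromstring(
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:type>Column Miniature</mi:type>"
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)

# Template of one triple: subject path, fixed property, object path.
RULE = ("./mi:type", "mi:hasCreator", "./mi:creator")


def apply_rule(rule, document):
    """Replace every path expression in the template by the matching value.

    Returns None if one of the expressions finds no element, i.e. the rule
    does not match this document.
    """
    triple = []
    for part in rule:
        if part.startswith("./"):
            value = document.findtext(part, namespaces=NS)
            if value is None:
                return None
            triple.append(value)
        else:
            triple.append(part)  # fixed parts such as the property are kept
    return tuple(triple)


print(apply_rule(RULE, DOCUMENT))
print(apply_rule(("./mi:type", "mi:hasPlace", "./mi:place"), DOCUMENT))
```

The second rule uses a made-up property name, mi:hasPlace, and does not match because the shortened document has no place element.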

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
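The difference between the two cases in this note can be made concrete with a small sketch. The class names, the synonym sets and the functions below are all hypothetical and chosen only for illustration (the polysemous term ‘net’ is borrowed from the interview earlier in this issue); the sketch does not reproduce the FMS ontology or tool.

```python
# Hypothetical synonym sets attached to hypothetical ontology classes.
SYNONYM_SETS = {
    "FishingNet": {"net", "fishing net"},
    "ComputerNetwork": {"net", "network"},
    "ColumnMiniature": {"column miniature", "miniature in column"},
}


def candidate_classes(term):
    """Return every class whose synonym set contains the term."""
    normalised = term.strip().lower()
    return sorted(
        name for name, synonyms in SYNONYM_SETS.items() if normalised in synonyms
    )


def resolve(term, choose=None):
    """Map a term to one class; ambiguous terms need a human decision."""
    candidates = candidate_classes(term)
    if not candidates:
        raise LookupError(f"no class found for {term!r}")
    if len(candidates) == 1:
        return candidates[0]  # synonyms resolve automatically
    if choose is None:
        raise LookupError(f"{term!r} is ambiguous: {candidates}")
    return choose(candidates)  # the cataloguer selects the interpretation


print(resolve("Miniature in column"))
print(candidate_classes("net"))
print(resolve("net", choose=lambda options: options[1]))
```

Synonyms are resolved by lookup alone, whereas the polysemous term yields two candidates and the decision is passed to a `choose` callback that stands in for the cataloguer.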

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema In section Semantic Transformation 1 we havedescribed the data model provided by RDF forexpressing statements about Web resources But wealso need a vocabulary for the RDF statementsnamely classes and properties defined with RDFSchema (RDFS)

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
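The effect of transitivity can be made concrete with a small Python sketch that treats RDF statements as plain (subject, property, value) tuples. This representation and the function name superclasses are assumptions for illustration; a real system would use an RDF toolkit.

```python
# The two subclass statements from the text, written as plain tuples.
SUBCLASS_TRIPLES = [
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
]


def superclasses(cls, triples):
    """Return every direct or inherited superclass of cls."""
    found = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                pending.append(o)  # follow the chain upwards
    return found


print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_TRIPLES)))
```

Although only two statements are asserted, the function also finds mi:Image as a superclass of mi:ColumnMiniature, which is exactly the implicit relationship described above.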

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
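How domain and range declarations could be used to detect violations, as mentioned earlier, is sketched below in Python. The schema statements are those given in the text; the instance data (ex:a, ex:b, ex:c) and the function names are hypothetical. The sketch assumes a validating reading of the constraints and, for brevity, ignores subclass relationships.

```python
# Domain and range declarations for mi:hasType, as given in the text.
SCHEMA = [
    ("mi:hasType", "rdfs:domain", "mi:ColumnMiniature"),
    ("mi:hasType", "rdfs:range", "mi:ColumnMiniature"),
]

# Hypothetical instance data, not taken from any museum collection.
DATA = [
    ("ex:a", "rdf:type", "mi:ColumnMiniature"),
    ("ex:b", "rdf:type", "mi:ColumnMiniature"),
    ("ex:c", "rdf:type", "mi:Image"),
]


def types_of(resource, data):
    """Return the classes a resource is explicitly declared to belong to."""
    return {o for s, p, o in data if s == resource and p == "rdf:type"}


def violations(statement, schema=SCHEMA, data=DATA):
    """List which constraints ('domain', 'range') a statement breaks."""
    s, p, o = statement
    problems = []
    for prop, constraint, cls in schema:
        if prop != p:
            continue
        if constraint == "rdfs:domain" and cls not in types_of(s, data):
            problems.append("domain")
        if constraint == "rdfs:range" and cls not in types_of(o, data):
            problems.append("range")
    return problems


print(violations(("ex:a", "mi:hasType", "ex:b")))
print(violations(("ex:c", "mi:hasType", "ex:b")))
```

A statement whose subject and value are both declared column miniatures passes, while a statement about ex:c, which is only declared to be an image, is reported as breaking the domain constraint.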

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’ [13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com, Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
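Conceptually, combining harvested descriptions into one repository amounts to taking the union of sets of RDF statements. The following Python sketch shows this under simplifying assumptions: statements are plain tuples, the crawling itself is left out, and the museum names, resources and the function name build_repository are hypothetical.

```python
# Shared ontology (one statement from the primer's running example).
ONTOLOGY = {("mi:Miniature", "rdfs:subClassOf", "mi:Image")}

# Hypothetical instance descriptions, as if already fetched by a crawler.
HARVESTED = {
    "museum-a": {("ex:a1", "rdf:type", "mi:Miniature")},
    "museum-b": {
        ("ex:b1", "rdf:type", "mi:Image"),
        ("ex:a1", "rdf:type", "mi:Miniature"),  # duplicate statement
    },
}


def build_repository(ontology, harvested):
    """Union the shared ontology with all museums' instance descriptions."""
    repository = set(ontology)
    for museum_triples in harvested.values():
        repository.update(museum_triples)
    return repository


repository = build_repository(ONTOLOGY, HARVESTED)
print(len(repository))
```

Because the repository is a set, a statement published by more than one museum is stored only once, and ontology and metadata end up in the same graph.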

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the user eventually finds the collection instance data being searched for.
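The principle of view-based filtering can be sketched in Python as follows. This is not the Ontogator implementation; the triples, the class ex:FromUtrecht and the function names are hypothetical and serve only to show how selecting more classes narrows the result.

```python
# Hypothetical repository content: two subclass statements and two items.
TRIPLES = {
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("ex:item1", "rdf:type", "mi:ColumnMiniature"),
    ("ex:item1", "rdf:type", "ex:FromUtrecht"),
    ("ex:item2", "rdf:type", "mi:Image"),
    ("ex:item2", "rdf:type", "ex:FromUtrecht"),
}


def classes_of(instance, triples):
    """Asserted classes of an instance plus all their superclasses."""
    result = {o for s, p, o in triples if s == instance and p == "rdf:type"}
    pending = list(result)
    while pending:
        current = pending.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in result:
                result.add(o)
                pending.append(o)
    return result


def view_filter(selected, triples):
    """Return, sorted, the instances that belong to every selected class."""
    instances = {s for s, p, o in triples if p == "rdf:type"}
    wanted = set(selected)
    return sorted(i for i in instances if wanted <= classes_of(i, triples))


print(view_filter(["mi:Image"], TRIPLES))
print(view_filter(["mi:Miniature", "ex:FromUtrecht"], TRIPLES))
```

Selecting only mi:Image returns both items; adding a second, more specific view leaves ex:item1 alone, which corresponds to constraining the views step by step.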

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems. [15]

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.

[14] T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

[15] M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/

K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (overview of the system’s set-up). [Diagram labels: collection databases 1 to n (relational schemas, DBMS); XML repositories (XML Schema: syntactic interoperability); metadata editor; RDF instances (RDF Schema: semantic interoperability); Web crawler; server with Ontogator software and RDF database (semantic graph: knowledge space of shared ontology and metadata); user client with WWW browser (topic-based navigation, view-based filtering).]


THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
| SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.van.kersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it


DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT, please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB

For the Semantic Web to succeed, it will require not only modelling languages such as XML, RDF and OWL, but also methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.

The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions, we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established, such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but it does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:

• Can we cost the creation of appropriate ontologies for the heritage sector?

• How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop, and which ones will we be able to borrow from other sectors)?

• What heritage-based organisations should focus on ontology creation?

• Ontologies often fail to be interoperable. What solutions are there to this problem, and how can they be made to work effectively?

• Does OWL (W3C’s Web Ontology Semantic Markup Language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
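The watercolour and aquarelle point above can be made concrete with a small sketch. It is not OWL and not any project's actual ontology: the concept names, the French labels and the two helper functions are assumptions made purely for illustration. The idea shown is that labels in several languages hang off one language-neutral concept, so that a ‘type of’ question gives the same answer whichever language the terms are asked in.

```python
# Hypothetical mini-ontology. Concept identifiers, links and labels are
# illustrative assumptions, not taken from any published heritage ontology.
SUBCLASS_OF = {
    "Watercolour": "Painting",
    "Painting": "Artwork",
    "Necklace": "Jewellery",
    "Jewellery": "Artwork",
}

# Each concept carries labels in more than one language.
LABELS = {
    "Watercolour": {"en": "watercolour", "fr": "aquarelle"},
    "Painting": {"en": "painting", "fr": "peinture"},
    "Necklace": {"en": "necklace", "fr": "collier"},
    "Jewellery": {"en": "jewellery", "fr": "bijoux"},
}


def concept_for(term):
    """Return the concept that has `term` as a label in any language, or None."""
    wanted = term.strip().lower()
    for concept, labels in LABELS.items():
        if wanted in labels.values():
            return concept
    return None


def is_type_of(term, broader_term):
    """True if `term` names a concept that is a direct or indirect subclass
    of the concept named by `broader_term`, regardless of language."""
    concept = concept_for(term)
    target = concept_for(broader_term)
    if concept is None or target is None:
        return False
    current = SUBCLASS_OF.get(concept)
    while current is not None:  # walk up the class taxonomy
        if current == target:
            return True
        current = SUBCLASS_OF.get(current)
    return False
```

With labels attached to concepts in this way, `is_type_of("aquarelle", "painting")` and `is_type_of("watercolour", "peinture")` resolve to the same taxonomy link as the English-only question. An ontology that stored only English strings would fail on both.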

Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.

Proof and trust are emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity, and with a lack of methods for testing its presence.

LEGITIMISING THE SEMANTIC WEB INVESTMENT

Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of web-page design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen, we need a stronger argument for the benefits that such investment will bring to the heritage sector.

Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard, and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.

At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly, heritage institutions found ways to take advantage of the opportunities offered by the web; there are still many small and medium-sized heritage institutions that have not.

Genesis – The Creation: Division of Sea and Earth; Creation of Trees and Plants

Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that the access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to the heritage sector institutions through better knowledge about, care of and access to their collections. In the ALM sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited, except of course at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have too little visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
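To illustrate what machine-readable access arrangements might look like in practice, here is a minimal sketch. Everything in it is assumed for the sake of the example: the two institutions, the field names and the two functions are hypothetical, and a real deployment would publish such data in RDF or a similar format rather than as Python dictionaries. It shows only the kind of question a tourism agent could answer once opening hours and facilities are structured rather than buried in prose.

```python
from datetime import datetime, time

# Hypothetical records; institutions and field names are illustrative only.
INSTITUTIONS = [
    {
        "name": "Example Town Museum",
        "wheelchair_access": True,
        # weekday number (Monday = 0) -> (opening time, closing time)
        # Open Tuesday to Sunday, 10:00 to 17:00.
        "opening_hours": {day: (time(10, 0), time(17, 0)) for day in range(1, 7)},
    },
    {
        "name": "Example Local Archive",
        "wheelchair_access": False,
        # Open Monday to Friday, 09:00 to 16:00.
        "opening_hours": {day: (time(9, 0), time(16, 0)) for day in range(0, 5)},
    },
]


def is_open(institution, when):
    """True if the institution is open at the given datetime."""
    hours = institution["opening_hours"].get(when.weekday())
    if hours is None:  # closed all day
        return False
    opens, closes = hours
    return opens <= when.time() < closes


def open_institutions(institutions, when, need_wheelchair_access=False):
    """Names of institutions open at `when`, optionally filtered by access."""
    return [
        inst["name"]
        for inst in institutions
        if is_open(inst, when)
        and (inst["wheelchair_access"] or not need_wheelchair_access)
    ]
```

An agent planning a visit would call `open_institutions(INSTITUTIONS, some_datetime, need_wheelchair_access=True)` and combine the answer with transport or weather data from other sources. The hard part, as the paper argues, is getting every institution to publish even this much in an agreed form.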

The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.


CONCLUSION

Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value; (b) an accessible and narrow ontology; and (c) a personalisable tool for processing knowledge.

Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and upon a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.

Bibliography

Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002. Berlin: Springer, 117-131.

Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.

Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3(1), 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf

Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.

Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.

Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002. Berlin: Springer, 448-453.

Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.

Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.

Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 120(10), 676-680.

Klein, M. 2001. XML, RDF and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.

McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.

Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002. Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf

Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.

Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA). ACM Publication.

Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.


DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL

AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS

By Joost van Kasteren

‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’

Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.

The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and to assure cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).

The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is being carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.

Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
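The interview does not say which Dublin Core elements the DEN assigns to which W, so the grouping below is an assumption made for illustration, as are the sample records and function names. The sketch shows the general idea only: once each database exposes its records as Dublin Core elements, a single coarse mapping lets one query run across all of them, at the price of the fuzziness Kersen mentions.

```python
# Assumed grouping of Dublin Core elements under the five Ws.
# The real Cultuurwijzer mapping is not given in the text.
FIVE_WS = {
    "who": ("creator", "contributor", "publisher"),
    "what": ("title", "subject", "type"),
    "where": ("coverage",),
    "when": ("date",),
    "why": ("description",),
}


def to_five_ws(record):
    """Group a Dublin Core record (element name -> list of values) by W."""
    grouped = {w: [] for w in FIVE_WS}
    for w, elements in FIVE_WS.items():
        for element in elements:
            grouped[w].extend(record.get(element, []))
    return grouped


def search(records, w, term):
    """Return records whose values under the given W contain `term`."""
    needle = term.lower()
    return [
        record
        for record in records
        if any(needle in value.lower() for value in to_five_ws(record)[w])
    ]


# Hypothetical records from two different institutional databases.
MUSEUM_RECORD = {
    "title": ["Example watercolour"],
    "creator": ["A. Painter"],
    "date": ["1890"],
    "coverage": ["Utrecht"],
}
ARCHIVE_RECORD = {
    "title": ["Example parish register"],
    "publisher": ["Example Parish"],
    "date": ["1750"],
    "coverage": ["Leiden"],
}
```

A query such as `search([MUSEUM_RECORD, ARCHIVE_RECORD], "who", "parish")` finds the archive record even though the name sits in `publisher` rather than `creator`. That is the interoperability gain, and also the loss of precision, that a five-Ws view brings.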

According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’

Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.

There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification and Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’

Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.

The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)

Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works; a “proof of concept”, you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’

Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl


DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.

‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction, it is like North. You go North, but you never arrive and say “here it is”. This is the Semantic Web,’ said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.

Genesis – The Creation: Birds and Fishes

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data, to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.

The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’

1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.

Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl), in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)³. But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’

3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US
5 CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), http://www.willpowerinfo.myby.co.uk/cidoc/CIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee in the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from a NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org/
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
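For readers who want to see what harvesting looks like on the wire, the following is a minimal sketch using only the Python standard library. The verb `ListRecords`, the `metadataPrefix`, `from` and `set` arguments and the three XML namespaces come from the OAI-PMH 2.0 specification. The repository address, the sample response and the function names are invented for illustration, and the sketch deliberately leaves out the HTTP call, error handling and resumption tokens.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build a ListRecords request URL for a repository's base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    if set_spec:
        params["set"] = set_spec  # restrict the harvest to one set
    return base_url + "?" + urlencode(params)


def titles_by_identifier(response_xml):
    """Map each record identifier in a ListRecords response to its dc:title values."""
    root = ET.fromstring(response_xml)
    result = {}
    for record in root.findall("oai:ListRecords/oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = record.findall("oai:metadata/oai_dc:dc/dc:title", NS)
        result[identifier] = [title.text for title in titles]
    return result


# Invented, abbreviated response from a hypothetical repository.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:item-1</identifier>
        <datestamp>2003-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example watercolour</dc:title>
          <dc:type>Image</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""
```

A service provider would fetch the URL returned by `list_records_url`, pass the response body to a parser such as `titles_by_identifier`, and index the result. Because every compliant repository answers in the same schema-validated format, the same few lines work against any of them, which is the interoperability the box describes.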

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy⁶, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk⁷. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something something dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame8 and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms (topological relations, mereological9 relations, dependence relations, these kinds of things), then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
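The point about stipulating the meaning of “before” can be made concrete with a small Python sketch. It encodes two competing stipulations as separate functions and shows that they disagree as soon as two periods overlap. The class and function names are assumptions made for this example, and the year boundaries are purely illustrative, not authoritative historical dates.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period
    end: int    # last year of the period

def before_strict(a, b):
    """Stipulation 1: a is before b only if a has ended when b starts (overlap excluded)."""
    return a.end < b.start

def before_allowing_overlap(a, b):
    """Stipulation 2: a is before b if a starts earlier, even if the two periods overlap."""
    return a.start < b.start

# Illustrative boundaries only.
late_middle_ages = Period("Late Middle Ages", 1300, 1500)
renaissance = Period("Renaissance", 1400, 1600)

strict_answer = before_strict(late_middle_ages, renaissance)
overlap_answer = before_allowing_overlap(late_middle_ages, renaissance)
```

With these sample values `strict_answer` is `False` and `overlap_answer` is `True`. Two systems that both record ‘Late Middle Ages before Renaissance’ would therefore only be interoperable if they had agreed on which axiom the word “before” stands for.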

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

On-To-Knowledge

On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html

See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

6 Office of the e-Envoy, http://www.e-envoy.gov.uk
7 Govtalk, http://www.govtalk.gov.uk
8 Sesame environment, http://www.ontoknowledge.org/tools/factsheetSesame.html
9 Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.

Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’


MESMUSES - Metaphors for Science Museums

MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.

The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).

Project Web site: http://cweb.inria.fr/Projects/Mesmuses

See also M. Meli, Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert Dr Dallas described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain: art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’

The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Annotation & Authoring

‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM

The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr

A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios

See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language

The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL

DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


Genesis – The Creation Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X

Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.


OIL - Ontology Inference Layer

OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies

Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel, Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): an upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language

The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework

See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web

‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal SemanticWeb.org is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE – Simple HTML Ontology Extensions

SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web

The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD

SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth, Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services

Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
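To give a feel for what ‘based on XML’ means for SOAP, the following Python sketch builds a minimal SOAP 1.1 request message. The envelope namespace is the one defined for SOAP 1.1; the service namespace, the `SearchCollection` operation and its `keyword` parameter are hypothetical and stand in for whatever a museum’s WSDL description would actually declare.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace
SERVICE_NS = "http://museum.example.org/collection"       # hypothetical service namespace

def build_search_request(keyword):
    """Wrap a hypothetical SearchCollection call in a SOAP Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    operation = ET.SubElement(body, f"{{{SERVICE_NS}}}SearchCollection")
    ET.SubElement(operation, f"{{{SERVICE_NS}}}keyword").text = keyword
    return ET.tostring(envelope, encoding="unicode")

message = build_search_request("textiles")

# A receiving service would parse the same XML to find the operation and its argument.
prefixes = {"soap": SOAP_ENV, "svc": SERVICE_NS}
received = ET.fromstring(message)
keyword = received.findtext("soap:Body/svc:SearchCollection/svc:keyword",
                            namespaces=prefixes)
```

In a real deployment the message would be sent over HTTP to an address discovered via a registry such as UDDI; that step is left out so the sketch runs without a network.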

World Wide Web

Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML – eXtensible Markup Language

See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991, he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory, the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that a generic ontology has to pin down is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
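The violist example can be sketched in a few lines of Python. The sketch replaces the single ambiguous word ‘part’ with two explicitly stipulated relations, so that the question about the finger gets a definite answer. The relation names, the sample facts and the decision to treat only one relation as transitive are assumptions made for illustration; they are one possible stipulation, not Guarino’s own axiomatisation.

```python
# Relation 1: physical component of a whole; stipulated here to be transitive.
COMPONENT_OF = {("fingernail", "finger"), ("finger", "violist")}

# Relation 2: membership of an individual in a group; stipulated NOT to chain
# with component_of.
MEMBER_OF = {("violist", "orchestra")}

def is_component_of(part, whole, facts=COMPONENT_OF):
    """Follow component_of links transitively (the sample facts contain no cycles)."""
    if (part, whole) in facts:
        return True
    return any(is_component_of(middle, whole, facts)
               for (p, middle) in facts if p == part)

def is_member_of(individual, group, facts=MEMBER_OF):
    """Membership holds only where it is stated directly."""
    return (individual, group) in facts

nail_in_violist = is_component_of("fingernail", "violist")   # chained component links
finger_in_orchestra = (is_component_of("finger", "orchestra")
                       or is_member_of("finger", "orchestra"))
```

Under these stipulations `nail_in_violist` is true, while `finger_in_orchestra` is false: the finger is a component of the violist and the violist is a member of the orchestra, but neither relation carries the finger into the orchestra. A system that stored all three facts under one undifferentiated ‘part’ relation could not draw that distinction.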

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it


SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES

AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren


The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.1 This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.2 Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people also in the heritage sector who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER

By Guntram Geser

1 Tim Berners-Lee, Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee, Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that, while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’3

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples. (Cf. http://www.w3.org/TR/rdf-primer)

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
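What a Simple Dublin Core description looks like in RDF/XML can be sketched with Python’s standard library. The RDF and Dublin Core namespaces are the published ones; the resource address (a made-up page on the fictitious http://www.m-i.org site) and the property values are assumptions chosen for the example, and the helper function is not part of any DCMI or W3C toolkit.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask the serialiser to use the customary rdf: and dc: prefixes.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)

def dublin_core_to_rdf(resource_uri, properties):
    """Serialise unqualified Dublin Core properties about one resource as RDF/XML."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_uri}
    )
    for name, value in properties.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(rdf, encoding="unicode")

# Hypothetical resource on the fictitious example site.
rdf_xml = dublin_core_to_rdf(
    "http://www.m-i.org/images/genesis",
    {"title": "Genesis", "creator": "Alexander Master"},
)

# Reading it back: each child of rdf:Description is one property of the resource.
description = ET.fromstring(rdf_xml).find(f"{{{RDF_NS}}}Description")
statements = {child.tag.split("}")[1]: child.text for child in description}
```

Each element inside `rdf:Description` corresponds to one statement of the form resource, property, value, which is what makes metadata in this form straightforward to merge across collections.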

Syntactic Transformation 1: Creating the XML Documents

In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation the datafrom a museumrsquos collection database are retrievedand transformed to an XML format conforming tothe XML Schema of the FMS initiative

The data to be published are read from the data-base through a lsquoviewrsquo which helps create the XMLformatThe view is a queryable interface a virtualtable that results from an SQL query which may joinmultiple tables of the databaseThrough the view thedata are queried so that the rows of the tables aregrouped by collection items For each item the setof rows is combined into a single XML document
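The view-based export described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the table names (item, origin), the view name item_view and the function rows_to_xml are hypothetical, the sketch assumes the view yields exactly one row per item, and the namespace is the fictitious one used in this primer.

```python
import sqlite3
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace used in this primer
ET.register_namespace("mi", MI_NS)

# Hypothetical in-memory database: two tables joined by a view.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id TEXT, type TEXT, subject TEXT, iconclass TEXT);
    CREATE TABLE origin (item_id TEXT, creator TEXT, manuscript TEXT,
                         place TEXT, year TEXT);
    INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature',
        'Eve emerges from Adam''s body', '71A3421');
    INSERT INTO origin VALUES ('image5kb78d38i', 'Alexander Master',
        'Historic Bible', 'Utrecht', 'circa 1430');
    CREATE VIEW item_view AS
        SELECT i.item_id, i.type, i.subject, i.iconclass,
               o.creator, o.manuscript, o.place, o.year
        FROM item i JOIN origin o ON o.item_id = i.item_id;
""")


def rows_to_xml(connection):
    """Return a dict mapping each item id to one XML document (as a string)."""
    cursor = connection.execute("SELECT * FROM item_view ORDER BY item_id")
    columns = [description[0] for description in cursor.description]
    documents = {}
    for row in cursor:
        record = dict(zip(columns, row))
        # The item id becomes an attribute of the root element.
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": record["item_id"]})
        # Every other column becomes one child element.
        for name in columns[1:]:
            child = ET.SubElement(root, f"{{{MI_NS}}}{name}")
            child.text = record[name]
        documents[record["item_id"]] = ET.tostring(root, encoding="unicode")
    return documents


docs = rows_to_xml(conn)
print(docs["image5kb78d38i"])
```

A view that returns several rows per item (for example one row per creator) would need an additional grouping step before the document is built.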

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents: A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, httpwwwcshelsinkifiu eahyvonexmlfinland2002 ProceedingsXML2002-finalpdf

ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam's body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a/10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Database rows grouped by item:
'image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process

XML document with the item data from the database rows (image5kb78d38i.xml):

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names/). The namespace declared in the start tag of our root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type></mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
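The well-formedness rules listed above can be tried out with any XML parser. The following is a minimal sketch using the parser from Python's standard library; the helper name is_well_formed and the shortened sample documents are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A shortened version of the example document (fictitious namespace).
good = ('<mi:image xmlns:mi="http://www.m-i.org/images" '
        'image_id="image5kb78d38i">'
        '<mi:creator>Alexander Master</mi:creator></mi:image>')

# Start and end tag differ in case, so they do not match.
bad_case = good.replace("</mi:creator>", "</mi:Creator>")
# The attribute value is not within quotation marks.
bad_quotes = good.replace('"image5kb78d38i"', "image5kb78d38i")
# Two root elements instead of a single one.
two_roots = "<a></a><b></b>"

for label, text in [("good", good), ("bad_case", bad_case),
                    ("bad_quotes", bad_quotes), ("two_roots", two_roots)]:
    print(label, is_well_formed(text))
```

Only the first document passes; each of the others breaks exactly one of the rules named in the text.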

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative's XML Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)   <xs:element name="image">
(4)     <xs:complexType>
(5)       <xs:sequence>
(6)         <xs:element name="type" type="xs:string"/>
(7)         <xs:element name="subject" type="xs:string"/>
(8)         <xs:element name="iconclass" type="xs:string"/>
(9)         <xs:element name="creator" type="xs:string"/>
(10)        <xs:element name="manuscript" type="xs:string"/>
(11)        <xs:element name="place" type="xs:string"/>
(12)        <xs:element name="year" type="xs:string"/>
(13)      </xs:sequence>
(14)      <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)    </xs:complexType>
(16)  </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3/16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5/13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
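To give an impression of what validation against this schema checks, the following sketch hand-codes the three structural constraints explained above: the root element, the ordered sequence of child elements and the required image_id attribute. It is a simplified stand-in and not an XML Schema validator; the function name check_image_document and the helper build are hypothetical, and a real workflow would use a schema-aware validator.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace used in this primer
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image_document(text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not image in the expected namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    # xs:sequence: the children must appear exactly in the declared order.
    found = [child.tag for child in root]
    wanted = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != wanted:
        problems.append("child elements do not match the required sequence")
    # The children are simple types, so they may not contain sub-elements.
    for child in root:
        if len(child) > 0:
            problems.append(f"{child.tag} must not contain sub-elements")
    return problems


def build(names, with_id=True):
    """Build a small test document with the given child element names."""
    body = "".join(f"<mi:{name}>x</mi:{name}>" for name in names)
    attribute = ' image_id="image5kb78d38i"' if with_id else ""
    return f'<mi:image xmlns:mi="{MI_NS}"{attribute}>{body}</mi:image>'


print(check_image_document(build(EXPECTED_CHILDREN)))
print(check_image_document(build(EXPECTED_CHILDREN[::-1], with_id=False)))
```

The first call reports no problems; the second reports the missing attribute and the wrong element order.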

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling for example the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4

Ontology

In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.

5 For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7

For example, for the image we have described in XML in graphic 1 the Iconclass definition is: 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
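For this particular code the hierarchical path can be derived mechanically, because every leading part of the notation is itself a broader concept. The sketch below assumes exactly that and nothing more: it is not a general Iconclass parser (other notations use additional syntax that it ignores), the function name hierarchical_path is hypothetical, and the labels are the ones listed for this image in this primer.

```python
# Labels as given for the hierarchical path of this image in the primer.
ICONCLASS_LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": ("Genesis from the creation to the expulsion from paradise "
            "and later years of Adam and Eve"),
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(notation):
    """Return all leading parts of the notation, broadest concept first."""
    return [notation[:length] for length in range(1, len(notation) + 1)]


for code in hierarchical_path("71A3421"):
    print(code, ICONCLASS_LABELS.get(code, "(label not listed)"))
```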

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9

The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'10

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body

6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.

7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl

8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml

9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html

10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327

11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.

12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.' Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs) and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
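The triple model is simple enough to be written down directly as data. The following sketch represents a small graph as a set of (subject, predicate, object) tuples. It is an illustration only: the names Triple, Literal and objects are assumptions, as are the '#' separator in our fictitious namespace, the URI chosen for the item and the hasCreator property URI.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate object")
Literal = namedtuple("Literal", "value")  # marks an object that is not a URI

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"            # fictitious namespace
ITEM = "http://www.m-i.org/images/image5kb78d38i"    # hypothetical item URI

graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
    # The object of this statement is a literal, not a resource.
    Triple(ITEM, MI + "hasCreator", Literal("Alexander Master")),
}


def objects(triples, subject, predicate):
    """Return all objects linked to the subject through the predicate."""
    return {t.object for t in triples
            if t.subject == subject and t.predicate == predicate}


print(objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf"))
print(objects(graph, ITEM, MI + "hasCreator"))
```

Because a graph is just a set of such triples, combining two graphs is a set union, which is what makes merging metadata from different sources straightforward at this level.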

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.

Graphic 2: RDF Data Model

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule
<image5kb78d38i/type mi:hasCreator image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator 'Alexander Master'>
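How such a template is instantiated can be sketched as follows. This is not the rule syntax of the FMS editor tool, which the available documents do not spell out here; the rule encoding (pairs of "xpath" or "const" and a value) and the function apply_rule are assumptions, and the sketch uses the limited XPath subset supported by Python's standard library. It also skips the term mapping that turns the raw value 'Column Miniature' into an ontology class.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}  # fictitious namespace of the primer

XML_DOC = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Hypothetical rule encoding: each slot of the triple template is either
# an XPath expression to be filled from the document or a constant.
RULE = (("xpath", "mi:type"),
        ("const", "mi:hasCreator"),
        ("xpath", "mi:creator"))


def apply_rule(rule, xml_text):
    """Instantiate the triple template; return None if the rule does not match."""
    root = ET.fromstring(xml_text)
    triple = []
    for kind, value in rule:
        if kind == "const":
            triple.append(value)
            continue
        element = root.find(value, NS)
        if element is None or element.text is None:
            return None  # an XPath expression found no value
        triple.append(element.text)
    return tuple(triple)


print(apply_rule(RULE, XML_DOC))
```

A rule whose XPath expression finds nothing, for instance one asking for mi:place in this shortened document, simply produces no triple.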

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
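The implicit subclass relationship can be made explicit by following the rdfs:subClassOf statements step by step. The sketch below does this for the two statements given above; the function name superclasses and the representation of the statements as pairs are assumptions made for illustration.

```python
# The two rdfs:subClassOf statements from the text, as (subclass, superclass).
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, subclass_pairs):
    """Return every direct and inherited superclass of cls."""
    result = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_pairs:
            # Follow each statement once; this also terminates on cycles.
            if sub == current and sup not in result:
                result.add(sup)
                frontier.append(sup)
    return result


print(superclasses("mi:ColumnMiniature", SUBCLASS_OF))
```

The result for mi:ColumnMiniature contains mi:Image although no statement says so directly, which is exactly the consequence of transitivity described in the text.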

Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
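Read as constraints in the way described above, domain and range declarations can be checked mechanically for each statement. The following sketch shows the idea under loudly stated assumptions: the ex: resources, the class mi:Place, the domain mi:Image chosen for mi:hasType and the function violations are all hypothetical, and classes are compared by exact match, without the subclass reasoning a real validator would add.

```python
# Property constraints. The range is the one used in the text; the domain
# mi:Image is a hypothetical choice for this illustration.
DOMAIN = {"mi:hasType": "mi:Image"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical resources and the class each one is an instance of.
TYPES = {
    "ex:scan1": "mi:Image",
    "ex:eve": "mi:ColumnMiniature",
    "ex:utrecht": "mi:Place",
}


def violations(subject, prop, obj):
    """Return which constraints ('domain', 'range') the statement breaks."""
    problems = []
    if prop in DOMAIN and TYPES.get(subject) != DOMAIN[prop]:
        problems.append("domain")
    if prop in RANGE and TYPES.get(obj) != RANGE[prop]:
        problems.append("range")
    return problems


print(violations("ex:scan1", "mi:hasType", "ex:eve"))      # satisfies both
print(violations("ex:scan1", "mi:hasType", "ex:utrecht"))  # wrong kind of value
print(violations("ex:utrecht", "mi:hasType", "ex:eve"))    # wrong kind of subject
```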

Benefits of RDF

In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'13

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
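The combination of harvested RDF cards and the idea of view-based filtering can be sketched with plain sets of triples. This is not how Ontogator is implemented; the museum prefixes a: and b:, the item identifiers, the class mi:Drawing and the function names are hypothetical and serve only to show the principle.

```python
# Shared ontology and two hypothetical RDF cards from different museums.
ontology = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}
card_museum_a = {
    ("a:item1", "rdf:type", "mi:ColumnMiniature"),
    ("a:item1", "mi:hasCreator", "Alexander Master"),
}
card_museum_b = {
    ("b:item7", "rdf:type", "mi:Miniature"),
    ("b:item9", "rdf:type", "mi:Drawing"),
}

# The repository is the union of the ontology and all harvested cards.
repository = ontology | card_museum_a | card_museum_b


def subclasses_of(cls, graph):
    """Return cls together with all of its direct and indirect subclasses."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in graph:
            if p == "rdfs:subClassOf" and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def instances_of(cls, graph):
    """Return all resources typed with cls or one of its subclasses."""
    classes = subclasses_of(cls, graph)
    return {s for s, p, o in graph if p == "rdf:type" and o in classes}


print(sorted(instances_of("mi:Miniature", repository)))
print(sorted(instances_of("mi:ColumnMiniature", repository)))
```

Selecting the broader class returns items from both museums, while the narrower class restricts the result further, which mirrors the step-by-step constraining of views described above.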

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge's introduction to multiagent systems. [15]

Agent

'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent

'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action

'By flexible autonomous action we mean reactive, proactive, social.'

- Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).'
- Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
- Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'

Desirable further properties of agents are:

- Mobility: the ability to move around an electronic network.
- Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
- Learning: an agent will improve its performance over time.
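The three characteristics of flexible autonomous action can be made concrete with a deliberately small sketch. The class `SimpleAgent` and its methods are hypothetical illustrations, not Wooldridge's formal model or any existing agent framework; the "environment" is just a list of record titles.

```python
class SimpleAgent:
    """Toy agent that looks for catalogue records mentioning a goal keyword."""

    def __init__(self, name, goal_keyword):
        self.name = name
        self.goal_keyword = goal_keyword
        self.found = []   # beliefs: records judged relevant to the goal
        self.inbox = []   # messages received from other agents

    def perceive(self, new_records):
        """Reactivity: respond to changes in the environment (new records)."""
        hits = [r for r in new_records if self.goal_keyword in r.lower()]
        self.found.extend(hits)
        return hits

    def next_action(self):
        """Proactiveness: pick the action that advances the goal."""
        return "report" if self.found else "search"

    def tell(self, other, content):
        """Social ability: pass a message to another agent."""
        other.inbox.append((self.name, content))


salome_agent = SimpleAgent("a", "salome")
psalter_agent = SimpleAgent("b", "psalter")
salome_agent.perceive(["Salome with the Head", "Psalter, Normandy"])
# The first agent forwards a record that is irrelevant to it but useful to the second.
salome_agent.tell(psalter_agent, "Psalter, Normandy")
print(salome_agent.next_action(), psalter_agent.inbox)
```

Mobility, rationality and learning are left out on purpose; they are considerably harder to capture in a few lines.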


14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3:1 (2001), 6-19.

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (overview of the system's set-up). Components shown: user client (WWW browser; topic-based navigation, view-based filtering); server (Ontogator software); RDF database (semantic graph: knowledge space of shared ontology and metadata); RDF Schema (semantic interoperability); XML Schema (syntactic interoperability); relational schemas (DBMS); Web crawler; metadata editor; XML repositories and RDF instances for collection databases 1, 2, ... n.

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at the Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
- CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
- SPECTRUM-XML: a project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
- Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992, she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999, Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)' that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th 2002 in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd 2002 in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r, size 45x50, illum.: 'Sub-Fauvel' Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


Genesis - The Creation: Division of Sea and Earth; Creation of Trees and Plants

accessible. We were not forced to dispose of work that we had 'webised' unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly, heritage institutions found ways to take advantage of the opportunities offered by the web; there are still many small and medium-sized heritage institutions that have not.

Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to heritage sector institutions through better knowledge about, care of, and access to their collections. In the ALM sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings, and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited, except of course at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have too little visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.

The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to 'load' an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist, they are not online and certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.


CONCLUSION

Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a 'personalisable agent' that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value; (b) an accessible and narrow ontology; and (c) a personalisable tool for processing knowledge.

Ultimately, the same factors that constrain the heritage sector's ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.

Bibliography

Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 117-131.

Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.

Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3:1, 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf

Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.

Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.

Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 448-453.

Heflin, J. and Hendler, J. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.

Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.

Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 12010, 676-680.

Klein, M. 2001. XML, RDF, and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.

McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.

Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf

Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.

Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng'02, 8-9 November 2002 (McLean, VA), ACM Publication.

Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM'02, 4-9 November 2002, 461-468.


DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL

AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS

By Joost van Kasteren

'To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.' Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. 'They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.'

Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.

The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and to assure cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).

The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (‘culture pointer’) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is being carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.

Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
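The idea of mapping Dublin Core to the 5 Ws can be illustrated with a small Python sketch. The interview does not say which elements the DEN assigns to which ‘W’, so the grouping below is purely an assumption made for illustration, as are the function names and the two sample records.

```python
# Hypothetical grouping of unqualified Dublin Core elements under the 5 Ws.
# The interview does not give the actual mapping; this one is an assumption.
FIVE_WS = {
    "who": ["creator", "contributor", "publisher"],
    "what": ["title", "subject", "type", "format"],
    "where": ["coverage"],
    "when": ["date"],
    "why": ["description"],
}


def to_five_ws(record):
    """Regroup a flat Dublin Core record (element -> list of values) by W."""
    grouped = {w: [] for w in FIVE_WS}
    for w, elements in FIVE_WS.items():
        for element in elements:
            grouped[w].extend(record.get(element, []))
    return grouped


def shared_values(record_a, record_b, w):
    """Values that two records from different databases share under one W."""
    return sorted(set(to_five_ws(record_a)[w]) & set(to_five_ws(record_b)[w]))


# Two made-up records, as they might come from a museum and an archive.
museum_record = {
    "creator": ["Hypothetical Painter"],
    "title": ["Harbour view"],
    "coverage": ["Amsterdam"],
    "date": ["1650"],
}
archive_record = {
    "contributor": ["Hypothetical Notary"],
    "title": ["Harbour ledger"],
    "coverage": ["Amsterdam"],
    "date": ["1650"],
}

print(to_five_ws(museum_record))
print(shared_values(museum_record, archive_record, "where"))
```

The sketch also shows the fuzziness Kersen mentions: a painter and a notary both end up under ‘who’, and the distinction between their roles is lost at this level.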

According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’

Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.

There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’

Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.

The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)

Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works; a “proof of concept”, you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’

Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl


DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

[Illustration: Genesis – The Creation, Birds and Fishes]

It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds, and they were certain now.

‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction; it is like North. You go North, but you never arrive and say “here it is”. This is the Semantic Web,’ said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.

The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years, and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’

1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

2 Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.

Cultural Units of Learning – Tools and Services (CULTOS)

CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day, and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)³. But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because, ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’

3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it

4 Telos Corporation, http://www.telos.com, Ashburn, VA, US.

5 CIDOC, International Committee for Documentation of the International Council of Museums, httpwwwwillpowerinfomyby coukcidocCIDOCe (ICOM-CIDOC): forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Initiative (OAI) Protocol for Metadata Harvesting

The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from a NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project, http://www.oaforum.org

The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
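The harvesting model of the OAI Protocol for Metadata Harvesting can be sketched in a few lines of Python. The sketch below builds a ListRecords request for unqualified Dublin Core (metadata prefix oai_dc) and reads the titles out of a response. The repository address and the sample response are invented for illustration, and no network call is made; a real harvester would also have to handle resumption tokens and error responses.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint, used only to show the request format.
BASE_URL = "http://repository.example.org/oai"

# XML namespaces used by OAI-PMH 2.0 responses carrying Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the HTTP GET URL for an OAI-PMH ListRecords request."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Made-up, heavily shortened response of the kind a data provider returns.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical museum object</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvested_titles(xml_text):
    """Return every dc:title found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iterfind(".//dc:title", NS)]


print(list_records_url(BASE_URL))
print(harvested_titles(SAMPLE_RESPONSE))
```

A service provider such as a search engine would issue requests of this kind against many repositories and index the Dublin Core fields it gets back.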

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy⁶, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk⁷. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard MPEG-7 and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms - topological relations, mereological⁹ relations, dependence relations, these kinds of things - then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
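Mr Guarino’s point about stipulating the meaning of ‘before’ can be made concrete with a short Python sketch. The two definitions below are only one possible pair of stipulations, and the period names and years are invented for illustration; a foundational ontology would state such choices as logical axioms, not as program code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval given by start and end years (illustrative)."""
    name: str
    start: int
    end: int


def before_excluding_overlap(a, b):
    """Stipulation 1: a is 'before' b only if a has ended when b starts."""
    return a.end <= b.start


def before_allowing_overlap(a, b):
    """Stipulation 2: a is 'before' b if a starts earlier, overlap or not."""
    return a.start < b.start


# Hypothetical periods that overlap by fifty years.
period_a = Period("Period A", 1400, 1600)
period_b = Period("Period B", 1550, 1700)

print(before_excluding_overlap(period_a, period_b))  # False
print(before_allowing_overlap(period_a, period_b))   # True
```

Two databases that both record ‘A before B’ can therefore disagree about the facts unless the stipulation behind the term is shared, which is the interoperability problem he describes.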

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

On-To-Knowledge

On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds), John Wiley, December 2002.

6 Office of the e-Envoy, http://www.e-envoy.gov.uk

7 Govtalk, http://www.govtalk.gov.uk

8 Sesame environment, httpwwwontoknowledgeorg toolsfactsheetSesamehtml

9 Mereology n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.

Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

MESMUSES - Metaphors for Science Museums

MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing - what to do about time, basic concepts, process, agents and so on - were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or between individuals and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting, just yet, that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so, too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’

The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): a collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr. A report on CIDOC's work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios/. See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


Genesis – The Creation Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.


OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel, Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb/

SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth, Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999

XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30

For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies, based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a concept that a generic ontology has to pin down is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
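The ‘part’ puzzle can be made concrete with a small sketch. It is only an illustration and not Guarino's own formalisation: the relation names COMPONENT_OF and MEMBER_OF and the helper transitive_closure are hypothetical. When both everyday senses are lumped into one transitive ‘part’ relation, the finger ends up as part of the orchestra; when each sense is kept as its own explicit relation, that conclusion no longer follows.

```python
# Two distinct relations that everyday language both calls "part".
# Both names are illustrative assumptions, not taken from any published ontology.
COMPONENT_OF = {("finger", "violist")}    # a physical part of a body
MEMBER_OF = {("violist", "orchestra")}    # a member of a group


def transitive_closure(pairs):
    """Return all pairs reachable by chaining (a, b) and (b, c) into (a, c)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Naive model: a single, ambiguous, transitive "part" relation.
naive_part = transitive_closure(COMPONENT_OF | MEMBER_OF)

# Explicit model: each relation is closed on its own, so the two never chain.
explicit_part = transitive_closure(COMPONENT_OF) | transitive_closure(MEMBER_OF)

print(("finger", "orchestra") in naive_part)
print(("finger", "orchestra") in explicit_part)
```

The design choice is simply to give each meaning of ‘part’ its own name, which is the kind of disambiguation the interview argues for.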

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

Developing well-founded generic ontologies seems an enormous job, but it is not as large a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics, that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it


SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren


The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people also in the heritage sector who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

1 Tim Berners-Lee, Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee, Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS) and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (Cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
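To give a feel for what Simple Dublin Core in RDF/XML looks like, the sketch below describes one image with three Dublin Core elements and reads the description back as subject-property-value statements. It is an illustration under assumptions, not an excerpt from the DCMI Recommendation or the FMS system: the resource address on the fictitious m-i.org site, the choice of dc:title for the subject line, and the helper name triples are all hypothetical. Only the two namespace names are the published RDF and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record on the fictitious m-i.org site used in this chapter.
RDF_XML = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="http://www.m-i.org/images/image5kb78d38i">
    <dc:title>Eve emerges from Adam's body</dc:title>
    <dc:creator>Alexander Master</dc:creator>
    <dc:date>circa 1430</dc:date>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Yield (subject, property URI, literal value) for each property element."""
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            yield subject, namespace + local, prop.text


for statement in triples(RDF_XML):
    print(statement)
```

Because every property is identified by a full URI rather than a bare tag name, two collections that both use dc:creator can be merged without first agreeing on a private vocabulary.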

Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
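The rows-to-XML step described above can be sketched in a few lines. This is a minimal illustration using an in-memory SQLite database, not the FMS implementation: the table names items and creators, the view definition and the function names are assumptions, and the namespace prefix is left out for brevity. Only the column names follow the ITEM_VIEW listing and the sample record used in this chapter.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Column names follow the ITEM_VIEW listing; everything else is hypothetical.
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def build_database():
    """Create two tables and a view that joins them, one row per item/creator."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE items (item_id TEXT PRIMARY KEY, type TEXT, subject TEXT,
                            iconclass TEXT, manuscript TEXT, place TEXT, year TEXT);
        CREATE TABLE creators (item_id TEXT, creator TEXT);
        INSERT INTO items VALUES ('image5kb78d38i', 'Column Miniature',
            'Eve emerges from Adam''s body', '71A3421', 'Historic Bible',
            'Utrecht', 'circa 1430');
        INSERT INTO creators VALUES ('image5kb78d38i', 'Alexander Master');
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.subject, i.iconclass, c.creator,
                   i.manuscript, i.place, i.year
            FROM items AS i JOIN creators AS c ON c.item_id = i.item_id;
    """)
    return con


def rows_to_xml(con):
    """Group the view's rows by item and build one XML document per item."""
    con.row_factory = sqlite3.Row
    documents = {}
    for row in con.execute("SELECT * FROM item_view ORDER BY item_id"):
        root = documents.get(row["item_id"])
        if root is None:
            root = ET.Element("image", {"image_id": row["item_id"]})
            documents[row["item_id"]] = root
        for field in FIELDS:
            # Skip values already recorded by an earlier row of the same item.
            seen = {child.text for child in root.findall(field)}
            if row[field] is not None and row[field] not in seen:
                ET.SubElement(root, field).text = row[field]
    return {item_id: ET.tostring(root, encoding="unicode")
            for item_id, root in documents.items()}


print(rows_to_xml(build_database())["image5kb78d38i"])
```

Querying through a view keeps the join logic in the database, so the export code only has to group rows by item identifier.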

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents
A short presentation is provided in Eero Hyvönen et al., Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen, Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al., Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services, Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own, it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
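The point that tags are meaningless to the parser can be seen in a short sketch. It uses Python's standard XML parser; the HANDLERS table and the function names are hypothetical and stand in for the ‘processing program’ mentioned above. The parser returns the same kind of result for <creator> and <H1>, namely a tag name and some text; any meaning comes only from the rules an application attaches to particular tag names.

```python
import xml.etree.ElementTree as ET


def parse(xml_text):
    """Return what a generic XML parser sees: a tag name and its text."""
    element = ET.fromstring(xml_text)
    return element.tag, element.text


# Hypothetical application rules: the only place where a tag name gains meaning.
HANDLERS = {
    "creator": lambda text: f"Made by: {text}",
    "H1": lambda text: text.upper(),
}


def process(xml_text):
    """Apply the application's rule for a tag, or pass the text through unchanged."""
    tag, text = parse(xml_text)
    handler = HANDLERS.get(tag)
    return handler(text) if handler else text


print(parse("<creator>Alexander Master</creator>"))
print(parse("<H1>Alexander Master</H1>"))
print(process("<creator>Alexander Master</creator>"))
print(process("<manuscript>Historic Bible</manuscript>"))
```

A tag without an entry in the table, such as <manuscript> here, is passed through untouched, which is exactly the sense in which it carries no semantics for the machine.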

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document (the declaration is optional in XML 1.0, but recommended). In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Database rows grouped by item (item data from database rows):

image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

XML document produced by the rows-to-XML process (image5kb78d38i.xml):

(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
         image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all elements must have a closing tag; start tags must match end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must be written with the same case; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
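The syntax rules above can be tried out with any XML parser. The following sketch uses the parser in Python's standard library; the function name is_well_formed and the four sample strings are illustrative assumptions. It shows that a case mismatch between start and end tag, an unquoted attribute value and a second root element are each rejected, and how the parser expands the mi prefix into the full namespace name.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


GOOD = (f'<mi:image xmlns:mi="{MI_NS}" image_id="image5kb78d38i">'
        '<mi:creator>Alexander Master</mi:creator></mi:image>')
# Start tag <mi:Creator> and end tag </mi:creator> differ in case.
BAD_CASE = (f'<mi:image xmlns:mi="{MI_NS}">'
            '<mi:Creator>Alexander Master</mi:creator></mi:image>')
# The attribute value is not within quotation marks.
UNQUOTED = f'<mi:image xmlns:mi="{MI_NS}" image_id=image5kb78d38i></mi:image>'
# Two top-level elements instead of a single root element.
TWO_ROOTS = "<a></a><b></b>"

for name, text in [("GOOD", GOOD), ("BAD_CASE", BAD_CASE),
                   ("UNQUOTED", UNQUOTED), ("TWO_ROOTS", TWO_ROOTS)]:
    print(name, is_well_formed(text))

# The prefix is only a placeholder: the parser reports the full namespace name.
print(ET.fromstring(GOOD).tag)
```

Well-formedness says nothing about whether the right elements are present; that is the job of validation against a schema, discussed next.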

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)    <xs:element name="image">
(4)      <xs:complexType>
(5)        <xs:sequence>
(6)          <xs:element name="type" type="xs:string"/>
(7)          <xs:element name="subject" type="xs:string"/>
(8)          <xs:element name="iconclass" type="xs:string"/>
(9)          <xs:element name="creator" type="xs:string"/>
(10)         <xs:element name="manuscript" type="xs:string"/>
(11)         <xs:element name="place" type="xs:string"/>
(12)         <xs:element name="year" type="xs:string"/>
(13)       </xs:sequence>
(14)       <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)     </xs:complexType>
(16)   </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
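What validation against this schema amounts to can be approximated in plain code. The sketch below is not an XML Schema validator and not the FMS tooling; it hand-codes only three of the constraints the schema expresses (the namespaced root element, the required image_id attribute and the ordered sequence of child elements). The function names check_image_document and make_document are hypothetical. In practice, a schema-aware validator would read the schema file itself.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Order taken from the xs:sequence in lines (6) to (12) of the schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image_document(xml_text):
    """Return a list of problems; an empty list means all three checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not image in the m-i.org namespace")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if [child.tag for child in root] != expected:
        problems.append("child elements do not match the declared sequence")
    return problems


def make_document(children, image_id="image5kb78d38i"):
    """Build a small test document with the given child element names."""
    attribute = f' image_id="{image_id}"' if image_id else ""
    body = "".join(f"<mi:{name}>x</mi:{name}>" for name in children)
    return f'<mi:image xmlns:mi="{MI_NS}"{attribute}>{body}</mi:image>'


print(check_image_document(make_document(EXPECTED_CHILDREN)))
print(check_image_document(make_document(EXPECTED_CHILDREN[::-1])))
print(check_image_document(make_document(EXPECTED_CHILDREN, image_id=None)))
```

Keeping such rules in a schema rather than in program code means every participating museum can run the same checks with off-the-shelf tools.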

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains including, in particular, the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
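The idea of multi-channel publishing can be sketched as one XML source with several renderings. In the illustration below, two plain Python functions stand in for style sheets, which is a simplification chosen for brevity; a production system would more likely use a style sheet language such as XSLT or CSS. The function names and the sample record layout are assumptions.

```python
import html
import xml.etree.ElementTree as ET

# One source record; the element names follow the chapter's running example.
SOURCE = """<image image_id="image5kb78d38i">
  <subject>Eve emerges from Adam's body</subject>
  <creator>Alexander Master</creator>
  <year>circa 1430</year>
</image>"""


def to_html(xml_text):
    """Render the record as an HTML list, e.g. for a Web page."""
    root = ET.fromstring(xml_text)
    items = "".join(
        f"<li>{html.escape(child.tag)}: {html.escape(child.text or '')}</li>"
        for child in root
    )
    return f"<ul>{items}</ul>"


def to_text(xml_text):
    """Render the same record as plain text, e.g. for a small display."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{child.tag}: {child.text or ''}" for child in root)


print(to_html(SOURCE))
print(to_text(SOURCE))
```

The data are written once; adding a new channel means adding a new rendering, not touching the source record.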

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.⁴

Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology or, rather, an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet, what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example, cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.⁵

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,⁶ and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see: Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421: Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C’s Resource Description Framework (RDF) Schema Specification, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’[10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
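The hierarchical path listed above can be derived mechanically from the notation itself, because in this example every broader concept is a prefix of the narrower one. The following is a minimal sketch, assuming a plain notation without the bracketed or doubled-letter extensions that Iconclass also allows; the function name `iconclass_path` is hypothetical and not part of any Iconclass tool.

```python
def iconclass_path(notation: str) -> list[str]:
    """Return every prefix of a plain Iconclass notation, broadest first.

    Simplifying assumption: each additional character narrows the concept
    by exactly one level, as in the 71A3421 example.
    """
    return [notation[:i] for i in range(1, len(notation) + 1)]


path = iconclass_path("71A3421")
print(path)
```

For the example code this yields the seven levels from 7 (Bible) down to 71A3421 (Eve emerges from Adam’s body).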

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorgbusinessmsstempexhtml
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources;

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;

| statements: these associate a value for a named property with the resource.

Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Graphic 2: RDF Data Model
[Diagram: two triples drawn as nodes (subject, object) joined by labelled arcs (predicate).]
Subject: http://www.m-i.org/schemas/images#ColumnMiniature, Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf, Object: http://www.m-i.org/schemas/images#Miniature
Subject: http://www.m-i.org/schemas/images#Miniature, Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf, Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
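The two triples of graphic 2 can also be written down as plain data. The sketch below is illustrative only: it holds each statement as a (subject, predicate, object) tuple in a Python set and does not use any RDF library. The `#` separator in the m-i.org namespace and the helper name `objects` are assumptions made for this example.

```python
# Namespaces; the '#' separator for the m-i.org namespace is an assumption.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"

# Each statement is a (subject, predicate, object) tuple of URIs.
graph = {
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by an arc labelled `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


print(objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf"))
```

Following an arc from a node is then a simple lookup over the set, which mirrors reading the labelled directed graph.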

Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
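The idea of a mapping rule as a triple template can be sketched in a few lines. This is not the FMS tool: the XML record, its element names and the function `apply_rule` are hypothetical, and the paths use the limited XPath subset supported by Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record; the real FMS XML structure is not shown here.
record = ET.fromstring(
    "<image id='example-1'>"
    "<type>Miniature</type>"
    "<creator>Alexander Master</creator>"
    "</image>"
)

# A rule is a triple template: paths select values, the predicate is fixed.
rule = ("./type", "mi:hasCreator", "./creator")


def apply_rule(element, rule):
    """Instantiate the template with every matching pair of element values."""
    subject_path, predicate, object_path = rule
    subjects = [e.text for e in element.findall(subject_path)]
    values = [e.text for e in element.findall(object_path)]
    # If either path matches nothing, the rule yields no triples.
    return [(s, predicate, v) for s in subjects for v in values]


triples = apply_rule(record, rule)
print(triples)
```

When either path fails to match, the result is an empty list, which corresponds to the rule not matching the document.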

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images#

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
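The transitivity of rdfs:subClassOf can be made concrete with a short sketch. It assumes the two subclass statements above are held in a plain dictionary from class to direct superclasses; the function name `superclasses` is hypothetical, and prefixed names stand in for full URIs.

```python
# Direct subclass statements from the text, keyed by subclass.
sub_class_of = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, sub_class_of):
    """Collect all direct and inherited superclasses of `cls`."""
    found, todo = set(), [cls]
    while todo:
        for parent in sub_class_of.get(todo.pop(), ()):
            if parent not in found:  # also guards against cycles
                found.add(parent)
                todo.append(parent)
    return found


print(superclasses("mi:ColumnMiniature", sub_class_of))
```

The implicit statement that mi:ColumnMiniature is a subclass of mi:Image is never stored; it falls out of walking the stated links.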

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
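Read as constraints, as this primer does, domain and range declarations allow a simple check of instance statements. The sketch below is an assumption-laden illustration: the instance names (`ex:img1` and so on) are hypothetical, classes are compared by exact match without subclass inference, and the function `violations` is not part of any RDF toolkit.

```python
# Constraints taken from the statements above.
domain = {"mi:hasType": "mi:ColumnMiniature"}
range_ = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical instances and their classes.
types = {
    "ex:img1": "mi:ColumnMiniature",
    "ex:img2": "mi:Image",
    "ex:img3": "mi:ColumnMiniature",
}

statements = [
    ("ex:img1", "mi:hasType", "ex:img3"),  # satisfies both constraints
    ("ex:img2", "mi:hasType", "ex:img3"),  # subject is outside the domain
    ("ex:img1", "mi:hasType", "ex:img2"),  # value is outside the range
]


def violations(statements, types, domain, range_):
    """Return the statements that break a domain or range constraint."""
    bad = []
    for s, p, o in statements:
        if p in domain and types.get(s) != domain[p]:
            bad.append((s, p, o, "domain"))
        elif p in range_ and types.get(o) != range_[p]:
            bad.append((s, p, o, "range"))
    return bad


report = violations(statements, types, domain, range_)
print(report)
```

A fuller validator would also accept instances of subclasses of the required class; that step is left out here to keep the check readable.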

Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.’[13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
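Combining harvested RDF cards into one repository amounts to taking the union of their triples together with the shared ontology. The following sketch assumes cards are already available as sets of tuples; the museum data, the `ex:` identifiers and the function `build_repository` are hypothetical, and no actual harvesting over the network is shown.

```python
# Shared ontology and two hypothetical RDF cards from different museums.
ontology = {("mi:Miniature", "rdfs:subClassOf", "mi:Image")}
card_museum_a = {
    ("ex:a1", "rdf:type", "mi:Miniature"),
    ("ex:a1", "mi:hasCreator", "Alexander Master"),
}
card_museum_b = {("ex:b7", "rdf:type", "mi:Image")}


def build_repository(ontology, cards):
    """Merge the ontology and all harvested cards into one set of triples."""
    repository = set(ontology)
    for card in cards:
        repository.update(card)  # identical triples are stored only once
    return repository


# Harvesting the same card twice does not duplicate its statements.
repository = build_repository(ontology, [card_museum_a, card_museum_b, card_museum_a])
print(len(repository))
```

Because a set of triples has no duplicates, repeated harvesting of an unchanged card leaves the repository as it was.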

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
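View-based filtering can be illustrated on a tiny semantic graph. This is a sketch of the general idea only, not of how Ontogator works internally: the graph content and the function `instances_of` are hypothetical, and a selected class is taken to include all of its subclasses.

```python
graph = {
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("ex:a1", "rdf:type", "mi:ColumnMiniature"),
    ("ex:a2", "rdf:type", "mi:Image"),
    ("ex:a1", "mi:hasCreator", "Alexander Master"),
}


def instances_of(selected, graph):
    """Find instances typed with `selected` or any of its subclasses."""
    # Index the direct subclasses of every class.
    children = {}
    for s, p, o in graph:
        if p == "rdfs:subClassOf":
            children.setdefault(o, set()).add(s)
    # Walk down the hierarchy from the selected class.
    classes, todo = {selected}, [selected]
    while todo:
        for child in children.get(todo.pop(), ()):
            if child not in classes:
                classes.add(child)
                todo.append(child)
    return {s for s, p, o in graph if p == "rdf:type" and o in classes}


print(instances_of("mi:Miniature", graph))
```

Selecting the narrower class mi:Miniature returns fewer instances than selecting mi:Image, which is the narrowing effect the text describes.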

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems:[15]

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’

| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network;

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge);

| Learning: an agent will improve its performance over time.


[14] T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
[15] M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
[Diagram: The Finnish Museums on the Semantic Web, overview of the system’s set-up. Components shown: collection databases 1 to n (relational schemas, DBMS); XML repositories (XML Schema, syntactic interoperability); metadata editor producing RDF instances (RDF Schema, semantic interoperability); Web crawler; RDF database holding the semantic graph, i.e. the knowledge space of shared ontology and metadata; server with Ontogator software; user client (WWW browser) offering topic-based navigation and view-based filtering.]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995, he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998, he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994, he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time, he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991, Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data, which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992, she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999, van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: PMillerukolnacuk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: FrankNackcwinl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference. See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy and, before that, worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: SRosshatiiartsglaacuk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies. See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottiangalileoimssfirenzeit


DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)' that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT, please contact the team of the project co-ordinator:

Mr Guntram Geser, guntramgesersalzburgresearchat
Phone: +43-(0)662-2288-303

Mr John Pereira, johnpereirasalzburgresearchat
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, srosshatiiartsglaacuk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5, from Fol. 264r, size 80x75, illuminator Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7, from Fol. 2r, size 45x50, illum. 'Sub-Fauvel' Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum. Alexander Master o.a.

Psalter / Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray (?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


CONCLUSION

Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and placing in the public domain a 'personalisable agent' that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.

Ultimately, the same factors that constrain the heritage sector's ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. For all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the Web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.

Bibliography

Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 117-131.

Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.

Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3(1), 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf

Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.

Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.

Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 448-453.

Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.

Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.

Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 12010, 676-680.

Klein, M. 2001. XML, RDF, and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.

McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.

Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf

Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.

Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng'02, 8-9 November 2002 (McLean, VA), ACM Publication.

Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM'02, 4-9 November 2002, 461-468.


DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL

AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS

By Joost van Kasteren

'To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.' Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. 'They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.'

Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.

The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and to assure cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).

The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer ('culture pointer') to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is being carried out on applying the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.

Kersen: 'The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.'
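As a rough illustration of the kind of mapping Kersen describes, the sketch below groups some of the unqualified Dublin Core elements under the 5 Ws. The element names come from the Dublin Core element set; the assignment of each element to a question, the `FIVE_WS` table, the `by_question` helper and the sample record are assumptions made for this example and are not the DEN's actual mapping.

```python
# Hypothetical grouping of unqualified Dublin Core elements under the 5 Ws.
# The assignment below is an illustrative assumption, not a published mapping.
FIVE_WS = {
    "who": ["creator", "contributor", "publisher"],
    "where": ["coverage"],
    "what": ["title", "subject", "type", "format"],
    "when": ["date"],
    "why": ["description"],
}


def by_question(record):
    """Regroup a flat Dublin Core record (element -> value) by question."""
    grouped = {}
    for question, elements in FIVE_WS.items():
        values = {e: record[e] for e in elements if e in record}
        if values:  # skip questions the record cannot answer
            grouped[question] = values
    return grouped


# Sample record built from a manuscript named in this issue's image list.
record = {
    "title": "Der Naturen Bloeme",
    "creator": "Jacob van Maerlant",
    "date": "c. 1350",
    "coverage": "Flanders",
}

if __name__ == "__main__":
    for question, values in by_question(record).items():
        print(question, values)
```

A record from any member database could be regrouped the same way before a cross-collection search. An element such as `coverage` can plausibly answer both 'where' and 'when', which may be one source of the fuzziness Kersen mentions.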

According to Kersen, a real Semantic Web is still a long way off. 'We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of "knowledge" is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.'

Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.

There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Metadata Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: 'At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.'

Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.

The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)

Projects are always carried out in co-operation with the member organisations. Kersen: 'We first try it ourselves until we are sure it works, a "proof of concept" you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.'

Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl


DIGICULT'S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

Genesis – The Creation: Birds and Fishes

It was really very kind of the Moderator. He asked the cultural heritage experts: 'What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?' Their inclinations had not always been clear. But the question focused minds, and they were certain now.

'I would put my money on the Semantic Web,' said two of them, not quite in unison. 'The Semantic Web is a direction, it is like North. You go North, but you never arrive and say "here it is". This is the Semantic Web,' said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper's dismaying thought that 'the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply'.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York, and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union's technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user's particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article1, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother's unexpected illness. The sons and daughters rely on Semantic Web 'agents', small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors' appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge world-wide investment in time and effort, creating countless 'ontologies' containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic 'agents' could refer for understanding before applying to specially formatted Web pages for the information.

The Berners-Lee et al. dazzling forecast is: 'The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.'

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII), suggested that while the current WWW content got a lot of 'bang' for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: 'Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?'

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years, and he knows the difficulties. He said: 'This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.'

Austria's Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: 'There is a 50-year research vision behind the issue of the Semantic Web,' and went on: 'but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.'

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology2 for digitised works of art. It had encountered problems with language representation, like: 'Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?'

Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called 'intertextual cultural threads'. These are based on 'EMMOs', a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

2 Ontology, n. 1. Philosophy: The branch of metaphysics that deals with the nature of being. 2. Logic: The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.

The Dutch had doubts too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: 'I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.'

Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human-computer interaction group. His concern: 'Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.'

The group had found that users change their requirements widely, and these shifts were invisible to a system. 'Humans can look at material one day, and the next day they look at the same stuff differently and describe it differently because they are in a different mood,' he said. He characterised the problem as: 'Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.'

He had one other worry: Webised mixed media. He said: 'This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that too.'

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city's famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)3. But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: 'Galileo's scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.'

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum's project was very ambitious. The time argument was difficult because 'of course it isn't possible to represent time properly within a relational database'. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos4 - that better represented issues of time.

Italian National Research Council's Nicola Guarino chipped in: 'The CIDOC5 reference models have partial answers to these questions.'

Dr Dallas went on: 'I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.'

3 The museum and Web site are rich resources for the life and work of Galileo, http://galileo.imss.firenze.it

4 Telos Corporation, http://www.telos.com, Ashburn, VA, US

5 CIDOC: International Committee for Documentation of the International Council of Museums, httpwwwwillpowerinfomybycoukcidocCIDOCe (ICOM-CIDOC). Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether 'key thinkers' were 'missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases'.

He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: 'This notion of the Semantic Web is not going to work with these databases. Is this true or not?'

Bert Degenhart-Drenth, the managing director of the Netherlands' ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI's goals comes from the Digital Library Federation, the Coalition for Networked Information and an NSF grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting: the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the Web standards HTTP and XML, and employs Dublin Core (unqualified) as its metadata standard. Heritage organisations whose systems support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, XML Schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation. http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins, A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project: http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
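To make the harvesting idea concrete, here is a minimal sketch of how a service provider might compose OAI Protocol requests. It assumes only what the protocol itself defines: a request is an ordinary HTTP query carrying a `verb` (such as ListRecords or GetRecord) plus arguments, and `oai_dc` is the prefix for unqualified Dublin Core. The repository address, the record identifier and the helper name `oai_request_url` are hypothetical and chosen purely for illustration.

```python
from urllib.parse import urlencode

# Hypothetical repository address, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def oai_request_url(base_url, verb, **arguments):
    """Compose an OAI-PMH request URL: the verb first, then its arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Ask for all records, expressed in unqualified Dublin Core.
list_records = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Ask for one record; the identifier is an invented example.
get_record = oai_request_url(
    BASE_URL,
    "GetRecord",
    identifier="oai:repository.example.org:textile-42",
    metadataPrefix="oai_dc",
)

print(list_records)
print(get_record)
```

The sketch only builds the request addresses. Fetching them and validating the XML responses against the protocol's schemas is left out.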

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath's UKOLN (formerly the UK Office for Library Networking) project, said the UK's Office of the e-Envoy [6], the agency in charge of Britain's e-Government programme, drove it with a Web site called Govtalk [7]. ‘One big ASP database,’ he called it. It carried the Government's interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission's On-To-Knowledge project, its RDF tool Sesame [8] and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department's DARPA (Defense Advanced Research Projects Agency) Agent Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers' audiovisual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it: by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms (topological relations, mereological [9] relations, dependence relations, these kinds of things) then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
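Mr Guarino's point about stipulating the meaning of ‘before’ can be illustrated with a small sketch. It is not taken from any ontology language discussed at the Forum; it simply writes the two possible stipulations as two separate, explicitly named Python functions, so that the choice is no longer hidden. The class `Period`, both function names and all year values are assumptions made for illustration, and the years are round numbers rather than historical claims.

```python
from typing import NamedTuple


class Period(NamedTuple):
    """A named time interval; start and end are years."""
    name: str
    start: int
    end: int


def before_strict(a: Period, b: Period) -> bool:
    """Stipulation 1: 'a before b' excludes overlap (a ends before b starts)."""
    return a.end < b.start


def before_allowing_overlap(a: Period, b: Period) -> bool:
    """Stipulation 2: 'a before b' also covers overlap
    (a starts earlier and ends earlier than b)."""
    return a.start < b.start and a.end < b.end


# Illustrative round-number years only.
renaissance = Period("Renaissance", 1400, 1600)
other = Period("other period", 1550, 1750)
later = Period("later period", 1700, 1800)

print(before_strict(renaissance, other))            # False: the two overlap
print(before_allowing_overlap(renaissance, other))  # True
print(before_strict(renaissance, later))            # True: no overlap at all
```

Two systems that both record ‘the Renaissance comes before the other period’ agree only if they have adopted the same stipulation, which is the interoperability problem the axioms are meant to remove.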

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project's many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

[6] Office of the e-Envoy: http://www.e-envoy.gov.uk
[7] Govtalk: http://www.govtalk.gov.uk
[8] Sesame environment: http://www.ontoknowledge.org/tools/factsheetSesame.html
[9] Mereology, n.: The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow, 1991.

Italian online publisher Marco Meli, a manager of the EU's MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don't have one.’
‘Is this the way forward?’


MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l'Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage (architecture, film, whatever), all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert Dr Dallas described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let's say a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums' Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years' work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) standard.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross's delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren't going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so, too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC's work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


Genesis - The Creation: Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal: http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.


OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group: http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium's plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman's warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

‘The Semantic Web as it is advocated by people like Tim Berners-Lee and James Hendler does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that a generic ontology has to define is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

Developing well-founded generic ontologies seems an enormous job, but the task is smaller than it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics, that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it


SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren


The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’ [1]. This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’ [2]. Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group's two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to makecollection metadata which stem from heterogen-eous databases semantically interoperable on theWeb and to provide facilities for semantic browsingand searching in the combined knowledge base ofthe participating museums

The projectrsquos vision is called ldquoFinnish Museums onthe Semantic Webrdquo (FMS) and its architecture allowsfor all Finnish museums to join in However in anapproach of starting small but ambitious the projectis at present using the collection databases of twomuseums the Espoo City Museum and the NationalMuseum of Finland Furthermore the implemen-tation is currently restricted to only one part of thecollections - textiles

In order to reach the FMS system's goal of making the museums' metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: 'I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it's gotten most of its traction in the mundane world of metadata.'3

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF's intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as 'an ideal representation for Dublin Core information' and describes Dublin Core as one of their 'RDF in the field' examples. (Cf. http://www.w3.org/TR/rdf-primer/)

At the Dublin Core Metadata Initiative (DCMI), 'Expressing Simple Dublin Core in RDF/XML' was announced as a DCMI Recommendation in October 2002, 'the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies', i.e. XML/RDF/XHTML. 'Expressing Qualified Dublin Core in RDF/XML' is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/

Syntactic Transformation 1: Creating the XML Documents

In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum's collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a 'view', which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
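The view-based export described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table layout (item, field), the view name item_view and the function rows_to_xml are hypothetical, since the actual museum databases and the FMS export code are not described here, and SQLite merely stands in for a museum's database system.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # namespace of the fictitious Website

# Hypothetical database layout: one row per item plus several field rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT);
    CREATE TABLE field (item_id TEXT, pos INTEGER, name TEXT, value TEXT);
    INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature');
    INSERT INTO field VALUES
        ('image5kb78d38i', 1, 'subject', 'Eve emerges from Adam''s body'),
        ('image5kb78d38i', 2, 'creator', 'Alexander Master'),
        ('image5kb78d38i', 3, 'place', 'Utrecht');
    -- The view is a virtual table defined by a query joining both tables.
    CREATE VIEW item_view AS
        SELECT i.item_id, i.type, f.pos, f.name, f.value
        FROM item i JOIN field f ON f.item_id = i.item_id;
""")

def rows_to_xml(connection):
    """Return {item_id: XML string}, one document per collection item."""
    ET.register_namespace("mi", MI_NS)
    rows = connection.execute(
        "SELECT item_id, type, name, value FROM item_view ORDER BY item_id, pos"
    ).fetchall()
    documents = {}
    # Group the view's rows by item and combine each group into one document.
    for item_id, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        ET.SubElement(root, f"{{{MI_NS}}}type").text = group[0][1]
        for _, _, name, value in group:
            ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents

documents = rows_to_xml(conn)
print(documents["image5kb78d38i"])
```

Because the museum only exposes the view, the export needs no knowledge of how the underlying tables are organised; a different museum could define the same view over a completely different table layout.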

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents

A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf

ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam's body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document must begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Item data from database rows (grouped by item):
image5kb78d38i, 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows-to-XML process

XML document (image5kb78d38i.xml):

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names/). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
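The syntax rules listed above are exactly what an XML parser checks before it hands a document to an application. The following minimal sketch uses the parser in Python's standard library to test a shortened version of our example document and three deliberately broken variants; the helper name is_well_formed and the variable names are our own illustrative choices.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

GOOD = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)
# Start tag and end tag differ in case, so they do not match.
BAD_CASE = GOOD.replace("</mi:creator>", "</mi:Creator>")
# The attribute value is not enclosed in quotation marks.
BAD_QUOTES = GOOD.replace('"image5kb78d38i"', "image5kb78d38i")
# The closing tag of the root element is missing.
BAD_UNCLOSED = GOOD.replace("</mi:image>", "")

for label, document in [("good", GOOD), ("case", BAD_CASE),
                        ("quotes", BAD_QUOTES), ("unclosed", BAD_UNCLOSED)]:
    print(label, is_well_formed(document))

# The parser resolves the prefix mi to the full namespace name.
print(ET.fromstring(GOOD).tag)
```

Only the first document is accepted. The last line shows that, for the parser, the element is identified by the namespace URI and the local name image; the prefix mi is merely a placeholder.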

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative's XML Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)   <xs:element name="image">
(4)     <xs:complexType>
(5)       <xs:sequence>
(6)         <xs:element name="type" type="xs:string"/>
(7)         <xs:element name="subject" type="xs:string"/>
(8)         <xs:element name="iconclass" type="xs:string"/>
(9)         <xs:element name="creator" type="xs:string"/>
(10)        <xs:element name="manuscript" type="xs:string"/>
(11)        <xs:element name="place" type="xs:string"/>
(12)        <xs:element name="year" type="xs:string"/>
(13)      </xs:sequence>
(14)      <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)    </xs:complexType>
(16)  </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element <xs:element name="image"> there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
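To make the role of the schema more tangible, the sketch below hand-codes three of the constraints explained above: the root element, the required image_id attribute and the ordered sequence of text-only child elements. It is a simplified stand-in written for this primer, not an XML Schema validator; in practice a schema-aware processor would read the schema file itself. The names REQUIRED_SEQUENCE, check_image_document and build_document are our own assumptions.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Child elements in the order required by the xs:sequence of the schema.
REQUIRED_SEQUENCE = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

def check_image_document(text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append(f"unexpected root element {root.tag}")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    expected = [f"{{{MI_NS}}}{name}" for name in REQUIRED_SEQUENCE]
    found = [child.tag for child in root]
    if found != expected:
        problems.append("child elements do not match the required sequence")
    for child in root:
        if len(child) > 0:  # simple types must not contain other elements
            problems.append(f"{child.tag} must contain text only")
    return problems

def build_document(names):
    """Build a small test document with the given child element names."""
    children = "".join(f"<mi:{name}>x</mi:{name}>" for name in names)
    return (f'<mi:image xmlns:mi="{MI_NS}" image_id="image5kb78d38i">'
            f"{children}</mi:image>")

valid = build_document(REQUIRED_SEQUENCE)
swapped = build_document(["subject", "type"] + REQUIRED_SEQUENCE[2:])
print(check_image_document(valid))
print(check_image_document(swapped))
```

The second document is well-formed XML but fails the check, because type and subject appear in the wrong order. This is the difference between a well-formed document and a valid one.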

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system- and non-application-specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling for example the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4

Ontology

In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see: Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system:

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9

The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'10

Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
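In the hierarchical path shown above, each level is obtained by adding one character to the notation of its parent. Under that assumption, the path of a notation can be derived from the notation alone, as the following sketch illustrates. It covers only this simple case; full Iconclass notations can contain further components that are not handled here, and the function name hierarchical_path and the LABELS table are our own illustrative choices, with the labels copied from the path above.

```python
def hierarchical_path(notation):
    """Return all levels of the path, from the main division to the notation."""
    return [notation[:length] for length in range(1, len(notation) + 1)]

# Textual correlates as given in the hierarchical path of our example.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

for code in hierarchical_path("71A3421"):
    print(code, LABELS[code])
```

A search for the broader concept 71A34 (creation of Eve) can therefore also retrieve images classified with the narrower notation 71A3421, simply by comparing the beginnings of the codes.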

6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser/. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties, which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.

Graphic 2: RDF Data Model
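The two triples of graphic 2 can also be written down as plain data. The following sketch represents each triple as a (subject, predicate, object) tuple, queries the resulting small graph by pattern, and prints it in a simple line-based notation in which every triple becomes one line. We assume here that both namespace URIs end in '#'; the helper names match and to_ntriples are our own and do not belong to any RDF toolkit.

```python
MI = "http://www.m-i.org/schemas/images#"        # our fictitious schema namespace
RDFS = "http://www.w3.org/2000/01/rdf-schema#"   # RDF Schema namespace

# Each statement is one (subject, predicate, object) triple.
triples = {
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}

def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every position that is not None."""
    return sorted(
        triple for triple in graph
        if (subject is None or triple[0] == subject)
        and (predicate is None or triple[1] == predicate)
        and (obj is None or triple[2] == obj)
    )

def to_ntriples(graph):
    """Write each triple on one line, with the URIs in angle brackets."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(graph))

# Which classes are declared as direct subclasses of Image?
for subject, _, _ in match(triples, predicate=RDFS + "subClassOf", obj=MI + "Image"):
    print(subject)

print(to_ntriples(triples))
```

Since the whole graph is just a set of triples, merging statements from several sources amounts to taking the union of their sets.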

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule
<image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, 'Alexander Master'>
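How such a template is instantiated can be sketched as follows. This is not the rule format of the FMS editor tool, which is not specified here; it is a minimal illustration in which a rule is a triple whose positions are either constants or path expressions, and in which the limited XPath support of Python's standard library stands in for a full XPath processor. The names RULE, apply_rule and the rule layout are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

DOCUMENT = """<mi:image xmlns:mi="http://www.m-i.org/images"
          image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# A rule is a triple template: "xpath" entries are looked up in the
# document, "const" entries are copied unchanged into the result.
RULE = (("xpath", "mi:type"), ("const", "mi:hasCreator"), ("xpath", "mi:creator"))

def apply_rule(rule, root):
    """Instantiate the template; return None if the rule does not match."""
    values = []
    for kind, value in rule:
        if kind == "xpath":
            value = root.findtext(value, namespaces=NS)
            if value is None:
                return None  # no matching element in this document
        values.append(value)
    return tuple(values)

root = ET.fromstring(DOCUMENT)
print(apply_rule(RULE, root))

# A rule that refers to an element missing from the document does not match.
NO_MATCH = (("xpath", "mi:type"), ("const", "mi:hasPlace"), ("xpath", "mi:place"))
print(apply_rule(NO_MATCH, root))
```

Keeping the rules as data rather than program code means that a museum with differently named XML elements only needs a different set of rules, not a different program.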

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature, and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
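The implicit subclass relationship can be derived mechanically by following the declared relationships upwards. The sketch below does this for the two statements of our example; the dictionary SUBCLASS_OF and the function superclasses are illustrative names of our own, not part of RDF Schema or of any RDF toolkit.

```python
# Directly declared rdfs:subClassOf statements of our example.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}

def superclasses(cls, direct=SUBCLASS_OF):
    """Return all declared and implied superclasses of a class."""
    found = set()
    pending = list(direct.get(cls, ()))
    while pending:
        parent = pending.pop()
        if parent not in found:  # also guards against circular declarations
            found.add(parent)
            pending.extend(direct.get(parent, ()))
    return found

print(sorted(superclasses("mi:ColumnMiniature")))
```

An application that is asked for all images can thus also return column miniatures, although no statement relates mi:ColumnMiniature to mi:Image directly.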

Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
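Read as constraints in the way described above, domain and range declarations allow a program to detect statements that combine resources and properties in an unintended way. The following sketch checks single triples against such declarations. All data in it are hypothetical: the property mi:hasCreator is given the domain mi:Image and the range mi:Person purely for illustration, and the names is_a and violations are our own.

```python
# Hypothetical schema and instance data for illustration only.
SUBCLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}
INSTANCE_OF = {"image5kb78d38i": "mi:ColumnMiniature",
               "alexander_master": "mi:Person"}
PROPERTIES = {"mi:hasCreator": {"domain": "mi:Image", "range": "mi:Person"}}

def is_a(resource, cls):
    """True if the resource is an instance of cls or of one of its subclasses."""
    current = INSTANCE_OF.get(resource)
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)
    return False

def violations(triple):
    """Return the constraint violations of one (subject, predicate, object)."""
    subject, predicate, obj = triple
    constraint = PROPERTIES.get(predicate)
    if constraint is None:
        return [f"unknown property {predicate}"]
    problems = []
    if not is_a(subject, constraint["domain"]):
        problems.append(f"{subject} is not in the domain {constraint['domain']}")
    if not is_a(obj, constraint["range"]):
        problems.append(f"{obj} is not in the range {constraint['range']}")
    return problems

print(violations(("image5kb78d38i", "mi:hasCreator", "alexander_master")))
print(violations(("alexander_master", "mi:hasCreator", "image5kb78d38i")))
```

The first statement passes because a column miniature is, through the subclass hierarchy, also an image. The second statement, with subject and object exchanged, violates both constraints.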

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'13

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
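View-based filtering can be illustrated with a small sketch. It assumes a single-parent class hierarchy and instances annotated with class names; the data, the `ex:` identifiers and the function names are hypothetical, and this is not Ontogator's actual implementation.

```python
# Hypothetical class hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "ex:Rug": "ex:Textile",
    "ex:Textile": "ex:Artefact",
    "ex:Helsinki": "ex:Finland",
}

# Hypothetical instances, each annotated with classes from several views.
INSTANCES = {
    "a:item1": {"ex:Rug", "ex:Helsinki"},
    "b:item7": {"ex:Textile", "ex:Turku"},
    "b:item9": {"ex:Chair", "ex:Helsinki"},
}


def ancestors(cls):
    """Return cls together with all of its superclasses."""
    result = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.add(cls)
    return result


def filter_by_views(selected):
    """Return the instances that fall under every selected class."""
    matches = set()
    for instance, annotations in INSTANCES.items():
        covered = set()
        for annotation in annotations:
            covered |= ancestors(annotation)
        if selected <= covered:  # every selected class must be covered
            matches.add(instance)
    return matches


broad = filter_by_views({"ex:Textile"})
narrow = filter_by_views({"ex:Textile", "ex:Finland"})
```

Adding a second class to the selection can only shrink the result set, which mirrors the way constraining the views further narrows the search.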

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
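A rough sketch of how such semantic links might be derived from metadata is given below. It is a deliberate simplification: links are created only between records that share a property value, whereas the system described here also draws on the ontology. The records, property names and the function `semantic_links` are hypothetical.

```python
from collections import defaultdict

# Hypothetical collection records with their metadata.
RECORDS = {
    "a:item1": {"ex:material": "ex:Wool", "ex:madeIn": "ex:Helsinki"},
    "b:item7": {"ex:material": "ex:Wool", "ex:madeIn": "ex:Turku"},
    "b:item9": {"ex:material": "ex:Birch", "ex:madeIn": "ex:Helsinki"},
}


def semantic_links(record_id):
    """Group other records by the (property, value) pair they share with record_id."""
    links = defaultdict(list)
    for prop, value in RECORDS[record_id].items():
        for other_id, other in sorted(RECORDS.items()):
            if other_id != record_id and other.get(prop) == value:
                links[(prop, value)].append(other_id)
    return dict(links)


links_for_item1 = semantic_links("a:item1")
```

Each key in the result names the reason for a link (for example, same material), so a browser page could label the link groups and thereby convey some of the wider context of an object.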

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge's introduction to multiagent systems. [15]

Agent

'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent

'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action

'By flexible autonomous action we mean reactive, proactive, social.'

| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).'

| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'

| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)

| Learning: an agent will improve its performance over time
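To make the three facets of flexible autonomous action more tangible, here is a toy sketch. It is not Wooldridge's formal model; the class `SketchAgent`, its priority order (react first, then communicate, then pursue the goal) and the percept keys `alarm` and `position` are all assumptions chosen for illustration.

```python
class SketchAgent:
    """Toy agent: reacts to its environment, answers peers, pursues a goal."""

    def __init__(self, goal):
        self.goal = goal   # proactive: the position the agent tries to reach
        self.inbox = []    # social: messages received from other agents
        self.log = []      # record of chosen actions

    def receive(self, sender, message):
        """Accept a message from another agent (a stand-in for an agent language)."""
        self.inbox.append((sender, message))

    def step(self, percept):
        """Choose one action for the current percept."""
        if percept.get("alarm"):
            action = "handle_alarm"            # reactive: respond to a change in time
        elif self.inbox:
            sender, _message = self.inbox.pop(0)
            action = "reply_to:" + sender      # social: interact with another agent
        elif percept.get("position") != self.goal:
            action = "move_towards_goal"       # proactive: goal-directed behaviour
        else:
            action = "idle"
        self.log.append(action)
        return action


agent = SketchAgent(goal=3)
agent.receive("peer", "hello")
first = agent.step({"alarm": True, "position": 0})
second = agent.step({"position": 0})
third = agent.step({"position": 0})
fourth = agent.step({"position": 3})
```

A real agent would need far richer perception, planning and communication; the sketch only shows how the three properties can coexist in one decision cycle.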


[14] T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

[15] M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3(1) (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (The Finnish Museums on the Semantic Web: Overview of the System's Set-up)

[Diagram. Labels, from the museums up to the user: Collection databases 1 to n (Relational Schemas, DBMS); XML repositories (XML Schema: syntactic interoperability); Metadata Editor; RDF instances (RDF Schema: semantic interoperability); Web crawler; RDF database (semantic graph: knowledge space of shared ontology and metadata); Server: Ontogator software; User client: WWW browser with topic-based navigation and view-based filtering.]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in a doctoral research in 'Urban and rural history'; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, UK
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it


DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)' that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50; illum.: 'Sub-Fauvel' Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL

AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS

By Joost van Kasteren

'To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.' Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web: 'They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.'

Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.

The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).

The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.

Kersen: 'The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.'

According to Kersen, a real Semantic Web is still a long way off: 'We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of "knowledge" is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.'

Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.

There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Metadata Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: 'At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.'

Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.

The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)

Projects are always carried out in co-operation with the member organisations. Kersen: 'We first try it ourselves until we are sure it works; a "proof of concept", you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.'

Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl


DIGICULT'S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

Genesis - The Creation Birds and Fishes

It was really very kind of the Moderator. He asked the cultural heritage experts: 'What is the message that we should give our scientific writer who will do a write-up of this meeting for the Thematic Issue?' Their inclinations had not always been clear. But the question focused minds and they were certain now.

'I would put my money on the Semantic Web,' said two of them, not quite in unison. 'The Semantic Web is a direction, it is like North. You go North but you never arrive and say "here it is". This is the Semantic Web,' said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper's dismaying thought that 'the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply'.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed finally that the cultural heritage sector needed the Semantic Web, and that a good deal of education and guidance would be required to make it appreciate that need.

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article,¹ Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.

The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’

1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

2 Ontology, n. 1. Philosophy: The branch of metaphysics that deals with the nature of being. 2. Logic: The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.

Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML/SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating, or development, of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
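Dr Dallas’s example of a richer representation, knowing that something is a person and that this person lived in a place, amounts to a handful of typed statements. The Python sketch below is an illustration only and was not shown at the Forum: the property names `is_a` and `lived_in` and the helper `values_of` are hypothetical, and are not drawn from the CIDOC models or any other vocabulary.

```python
# Each statement is a (subject, property, value) triple.
# The property names are invented for this illustration.
statements = [
    ("Galileo Galilei", "is_a", "Person"),
    ("Florence", "is_a", "Place"),
    ("Galileo Galilei", "lived_in", "Florence"),
]

def values_of(subject, prop):
    """Return every value stated for the given subject and property."""
    return [v for s, p, v in statements if s == subject and p == prop]

print(values_of("Galileo Galilei", "is_a"))      # ['Person']
print(values_of("Galileo Galilei", "lived_in"))  # ['Florence']
```

A system holding such statements can answer ‘who lived in this place?’ rather than merely matching the word ‘Florence’ in free text, which is the richer experience described above.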

3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it

4 Telos Corporation, http://www.telos.com, Ashburn, VA, US.

5 CIDOC, International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), httpwwwwillpowerinfomybycoukcidocCIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee in the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information, and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols, HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project
http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe. An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
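The OAI Protocol requests mentioned in the information box on metadata harvesting are ordinary HTTP requests that carry a `verb` argument. As a minimal sketch, assuming a hypothetical repository address and record identifier (neither comes from the Forum discussion), the two requests named in the box, GetRecord and ListRecords, could be assembled in Python as follows. The value `oai_dc` is the metadata prefix the protocol uses for unqualified Dublin Core.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used for illustration only.
BASE_URL = "http://museum.example.org/oai"

def oai_request(verb, **arguments):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{BASE_URL}?{query}"

# Ask for all records in unqualified Dublin Core.
list_url = oai_request("ListRecords", metadataPrefix="oai_dc")

# Ask for a single record; the identifier is invented for this example.
get_url = oai_request(
    "GetRecord",
    identifier="oai:museum.example.org:object-42",
    metadataPrefix="oai_dc",
)

print(list_url)
print(get_url)
```

The sketch only builds the request addresses. A harvester would then fetch each address and validate the XML response against the schema, which is the data validation step the box describes.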

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
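Mr Guarino’s point about stipulating ‘before’ can be made concrete with a small sketch. The Python below is an illustration only, not anything presented at the Forum: the `Period` class, the two function names and the year ranges are assumptions chosen for the example. It states two possible axioms for ‘before’, one that excludes overlapping periods and one that allows them, and shows that the same pair of periods satisfies one reading and not the other.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period
    end: int    # last year of the period

def before_strict(a: Period, b: Period) -> bool:
    """Stipulation 1: a is before b only if a ends before b starts."""
    return a.end < b.start

def before_allowing_overlap(a: Period, b: Period) -> bool:
    """Stipulation 2: a is before b if it starts and ends earlier, overlap allowed."""
    return a.start < b.start and a.end < b.end

# Illustrative year ranges only; historians date these periods differently.
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

print(before_strict(renaissance, baroque))            # False
print(before_allowing_overlap(renaissance, baroque))  # True
```

Two catalogues that both record ‘Renaissance before Baroque’ would disagree on what that means unless they share one of these stipulations, which is the interoperability problem the axioms are meant to remove.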

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

6 Office of the e-Envoy, http://www.e-envoy.gov.uk

7 Govtalk, http://www.govtalk.gov.uk

8 Sesame environment, httpwwwontoknowledgeorgtoolsfactsheetSesamehtml

9 Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert Dr Dallas described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) standard.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting, just yet, that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’

The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM is being developed into an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

SEMANTIC WEB TERMS AND READING LIST A-X

Compiled by Guntram Geser and Michael Steemson

Genesis – The Creation Stars and Fishes

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools. Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html

OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing, and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001: http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet: http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University: http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development: http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002): http://www.cultivate-int.org/issue7/semanticweb

SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth, Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001: http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999

XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30

For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML; the XML specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me’ – Dan Connolly, The XML Revolution, October 1998: http://www.nature.com/nature/webmatters/xml

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

‘The Semantic Web as it is advocated by people like Tim Berners-Lee and James Hendler does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a concept that a generic ontology has to pin down is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it


A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER

By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

1 Tim Berners-Lee, Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee, Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata which stem from heterogeneous databases semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS) and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item the set of rows is combined into a single XML document.
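The view-to-XML step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the two source tables (items, creators) and their layout are hypothetical, the view name and column names follow the ITEM_VIEW figure of this chapter, and the namespace is that of the fictitious m-i.org site.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # namespace of the fictitious m-i.org site
COLUMNS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]

def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout and a view."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE items (item_id TEXT, type TEXT, subject TEXT, iconclass TEXT,
                            manuscript TEXT, place TEXT, year TEXT);
        CREATE TABLE creators (item_id TEXT, creator TEXT);
        INSERT INTO items VALUES ('image5kb78d38i', 'Column Miniature',
            'Eve emerges from Adam''s body', '71A3421', 'Historic Bible',
            'Utrecht', 'circa 1430');
        INSERT INTO creators VALUES ('image5kb78d38i', 'Alexander Master');
        CREATE VIEW ITEM_VIEW AS
            SELECT i.item_id, i.type, i.subject, i.iconclass, c.creator,
                   i.manuscript, i.place, i.year
            FROM items i JOIN creators c ON c.item_id = i.item_id;
    """)
    return db

def rows_to_xml(db):
    """Query the view, group its rows by item, and return one XML string per item."""
    ET.register_namespace("mi", MI_NS)
    rows = db.execute("SELECT * FROM ITEM_VIEW ORDER BY item_id").fetchall()
    documents = {}
    for item_id, group in groupby(rows, key=lambda row: row[0]):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for row in group:
            for name, value in zip(COLUMNS, row[1:]):
                tag = f"{{{MI_NS}}}{name}"
                # A join can repeat the same value on several rows; emit it once.
                if not any(e.tag == tag and e.text == value for e in root):
                    ET.SubElement(root, tag).text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents

documents = rows_to_xml(build_demo_db())
print(documents["image5kb78d38i"])
```

Grouping by item_id is what turns several database rows into a single document per collection item; an item with two creators would simply yield two creator elements in the same document.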

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents
A short presentation is provided in Eero Hyvönen et al., Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen, Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al., Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services, Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the individual elements.
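The point that tags carry no meaning for the parser can be illustrated with a small Python sketch. The sample record and the INTERPRETATION table are invented for this illustration; the table stands in for whatever processing program an application supplies.

```python
import xml.etree.ElementTree as ET

def tag_names(xml_text):
    """Return the element names found in a document, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]

record = "<record><creator>Alexander Master</creator><H1>Alexander Master</H1></record>"

# The parser reports both names as plain strings; it attaches no meaning to either.
print(tag_names(record))

# Meaning enters only through a processing program, here a hypothetical
# lookup table that says how this particular application reads each tag.
INTERPRETATION = {"creator": "name of the person who made the item"}

def interpret(xml_text):
    """Pair each element's text with the application's reading of its tag."""
    result = []
    for element in ET.fromstring(xml_text):
        meaning = INTERPRETATION.get(element.tag, "unknown to this application")
        result.append((element.text, meaning))
    return result

print(interpret(record))
```

Two applications with different lookup tables would read the same document differently, which is exactly the gap that shared schemas and ontologies are meant to close.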

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.

Database rows grouped by item (item data from database rows):
‘image5kb78d38i’, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

Rows to XML process

XML document (image5kb78d38i.xml):

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
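The syntax rules above can be tried out with any XML parser. The following Python sketch is an illustration only: the sample strings are made up for this purpose, and the helper names are our own. It checks a handful of documents for well-formedness and also shows that the namespace prefix is merely a placeholder for the namespace name.

```python
import xml.etree.ElementTree as ET

NS_DECL = 'xmlns:mi="http://www.m-i.org/images"'

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

samples = {
    "good": f'<mi:image {NS_DECL} image_id="image5kb78d38i">'
            "<mi:creator>Alexander Master</mi:creator></mi:image>",
    # Start tag and end tag differ in case, so they do not match.
    "wrong_case": f"<mi:image {NS_DECL}>"
                  "<mi:Creator>Alexander Master</mi:creator></mi:image>",
    # Attribute value without quotation marks.
    "unquoted_attribute": f"<mi:image {NS_DECL} image_id=image5kb78d38i></mi:image>",
    # The creator element is never closed.
    "missing_end_tag": f"<mi:image {NS_DECL}><mi:creator>Alexander Master</mi:image>",
    # More than one root element.
    "two_roots": f"<mi:image {NS_DECL}></mi:image><mi:image {NS_DECL}></mi:image>",
}
results = {name: is_well_formed(text) for name, text in samples.items()}
print(results)

def qualified_names(xml_text):
    """Return element names with the prefix resolved to the namespace name."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]

with_mi = f"<mi:image {NS_DECL}><mi:type>Column Miniature</mi:type></mi:image>"
with_x = ('<x:image xmlns:x="http://www.m-i.org/images">'
          "<x:type>Column Miniature</x:type></x:image>")
# Different prefixes, same namespace name: the parser sees identical element names.
print(qualified_names(with_mi) == qualified_names(with_x))
```

Only the first sample passes; each of the others breaks exactly one of the rules listed above.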

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
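To make the constraints of the schema described above tangible, here is a hand-written Python check that mirrors them. It is a sketch only and not a schema validator: Python's standard library does not validate against XML Schema, so a real project would pass the schema itself to a validating tool (third-party packages such as lxml offer this). The function and helper names are our own.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Child elements in the order fixed by the xs:sequence of the schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

def check_image_document(xml_text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not image in the m-i.org namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != expected:
        problems.append("child elements do not match the required sequence")
    for child in root:
        if len(child) > 0:
            problems.append(f"{child.tag} must hold text only, not child elements")
    return problems

def make_document(children, image_id="image5kb78d38i"):
    """Build a small test document with the given child element names."""
    attribute = f' image_id="{image_id}"' if image_id else ""
    body = "".join(f"<mi:{name}>x</mi:{name}>" for name in children)
    return f'<mi:image xmlns:mi="{MI_NS}"{attribute}>{body}</mi:image>'

valid = make_document(EXPECTED_CHILDREN)
invalid = make_document(list(reversed(EXPECTED_CHILDREN)), image_id=None)

print(check_image_document(valid))
print(check_image_document(invalid))
```

The second document is well-formed XML but breaks two rules of the schema: the required attribute is absent and the child elements are out of order. This is the difference between a well-formed and a valid document.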

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system-specific and non-application-specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet: Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
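The multi-channel publishing idea mentioned among the benefits above can be sketched as follows. In practice the display rules would live in a style sheet; in this illustration two small Python functions stand in for two style sheets, and the abbreviated record is a cut-down version of the example document used in this chapter.

```python
import xml.etree.ElementTree as ET
from html import escape

MI = "{http://www.m-i.org/images}"

# One structured record, abbreviated from the chapter's example document.
record = ET.fromstring(
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:year>circa 1430</mi:year>"
    "</mi:image>"
)

def field(name):
    """Return the text of one child element of the record."""
    return record.findtext(MI + name)

def as_html():
    """Web page channel: a heading plus a caption line."""
    subject = escape(field("subject"), quote=False)
    caption = escape(f"{field('creator')}, {field('year')}", quote=False)
    return f"<h1>{subject}</h1><p>{caption}</p>"

def as_text():
    """Plain-text channel, for instance for a small handheld display."""
    return f"{field('subject')} ({field('creator')}, {field('year')})"

print(as_html())
print(as_text())
```

The data are stored once; only the presentation rules differ per channel.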

Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.⁴

Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.⁵

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,⁶ and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.⁷

For example, for the image we have described in XML in graphic 1 the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place⁸ that provides the hierarchical path for this concept in the classification system.
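How a hierarchical path can be derived from a classification code is sketched below in Python. The labels are copied from the Iconclass path quoted in this chapter. The sketch assumes that every broader concept's code is a prefix of the narrower one, which holds for this example but is not claimed to cover every feature of the Iconclass notation.

```python
# Labels copied from the Iconclass path quoted in this chapter; a real
# application would look them up in the full classification system.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

def hierarchical_path(code, labels=LABELS):
    """Return (code, label) pairs from the broadest known concept down to `code`."""
    prefixes = [code[:length] for length in range(1, len(code) + 1)]
    return [(prefix, labels[prefix]) for prefix in prefixes if prefix in labels]

for notation, label in hierarchical_path("71A3421"):
    print(notation, label)
```

Because the hierarchy is encoded in the notation itself, a search for 71A34 (creation of Eve) can also retrieve images classified with the narrower codes 71A342 and 71A3421 by a simple prefix comparison.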

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.⁹

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’¹⁰

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).¹¹ This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.¹²


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body

6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website: http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’ the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry and very academic.’ Uche Ogbuji, An introduction to RDF (2000): http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can also be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
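The triple structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the FMS system: the `URI`, `Literal` and `Triple` types, the `#` separator in the `mi` namespace and the resource `image5` are hypothetical names introduced here.

```python
from typing import NamedTuple, Union


class URI(str):
    """A resource identifier (illustrative wrapper around str)."""


class Literal(str):
    """A plain literal value, e.g. a name (drawn as a box in the graph)."""


class Triple(NamedTuple):
    subject: URI                 # always a resource
    predicate: URI               # always a resource (the arc label)
    obj: Union[URI, Literal]     # a resource or a literal


# Namespace prefixes; the '#' separator for the mi namespace is an assumption.
MI = "http://www.m-i.org/schemas/images#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# A graph is simply a set of triples.
graph = {
    Triple(URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf"), URI(MI + "Miniature")),
    Triple(URI(MI + "Miniature"), URI(RDFS + "subClassOf"), URI(MI + "Image")),
    # 'image5' is a hypothetical instance used only for illustration.
    Triple(URI(MI + "image5"), URI(MI + "hasCreator"), Literal("Alexander Master")),
}


def objects(graph, subject, predicate):
    """Follow all arcs labelled `predicate` that start at `subject`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}
```

Calling `objects(graph, URI(MI + "Miniature"), URI(RDFS + "subClassOf"))` follows the arc from the Miniature node to the Image node, which is the traversal that the graph drawing makes visible.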

Graphic 2: RDF Data Model
[Graph of two triples (subject, predicate, object): ColumnMiniature, rdfs:subClassOf, Miniature; and Miniature, rdfs:subClassOf, Image]

With these two triples we state that ColumnMiniature is a subclass of Miniature, and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.

Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples in which XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples in which the XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator 'Alexander Master'>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
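The way a mapping rule is instantiated can be sketched as follows. This is a minimal illustration, not the Meedio tool: the XML record layout, the paths in `RULE` and the function `apply_rule` are assumptions made for this example, and Python's `xml.etree.ElementTree` supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML metadata record; the real FMS documents will differ.
XML_RECORD = """
<record>
  <image>
    <type>Miniature</type>
    <creator>Alexander Master</creator>
  </image>
</record>
"""

# A rule is a triple template: the subject and object slots hold path
# expressions, the predicate slot holds the RDF property to assert.
RULE = ("./image/type", "mi:hasCreator", "./image/creator")


def apply_rule(root, rule):
    """Instantiate the template; return no triples if a path does not match."""
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []  # the rule does not match this document
    return [(subject.strip(), predicate, obj.strip())]


root = ET.fromstring(XML_RECORD)
triples = apply_rule(root, RULE)
```

For the record above, `triples` holds a single statement linking the value found at the type path to the value found at the creator path through `mi:hasCreator`; a document lacking either element yields no statements at all.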

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same term refers to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature, and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
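The effect of transitivity can be made concrete with a short sketch that derives the implicit superclasses from the two explicit statements. The dictionary layout and the function name `superclasses` are illustrative assumptions, not part of RDF Schema itself.

```python
# Explicit rdfs:subClassOf statements, keyed by subclass.
SUB_CLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, sub_class_of):
    """Return every stated or inferred superclass of `cls`."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in found:  # also guards against cycles
                found.add(parent)
                stack.append(parent)
    return found


inferred = superclasses("mi:ColumnMiniature", SUB_CLASS_OF)
```

Here `inferred` contains both `mi:Miniature` (stated) and `mi:Image` (implied), so a search for images would also retrieve column miniatures without anyone having recorded that relationship explicitly.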

Defining properties

In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
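A validator that checks statements against domain and range declarations, in the constraint reading used in this primer, might look like the sketch below. It is an illustration only: the instance names `ex:img1`, `ex:img2` and `ex:doc9` and the function `violations` are hypothetical, and subclass inference is deliberately left out to keep the example short.

```python
RDF_TYPE = "rdf:type"

# Property declarations taken from the statements above.
SCHEMA = {
    "mi:hasType": {"domain": "mi:ColumnMiniature", "range": "mi:ColumnMiniature"},
}

# Hypothetical instance data; ex:doc9 has no declared class.
DATA = [
    ("ex:img1", RDF_TYPE, "mi:ColumnMiniature"),
    ("ex:img2", RDF_TYPE, "mi:ColumnMiniature"),
    ("ex:img1", "mi:hasType", "ex:img2"),
    ("ex:doc9", "mi:hasType", "ex:img2"),
]


def violations(triples, schema):
    """List (subject, predicate, object, kind) for each broken constraint."""
    # First pass: collect the declared classes of every resource.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    # Second pass: check each statement whose property has a declaration.
    problems = []
    for s, p, o in triples:
        constraint = schema.get(p)
        if constraint is None:
            continue
        if constraint["domain"] not in types.get(s, set()):
            problems.append((s, p, o, "domain"))
        if constraint["range"] not in types.get(o, set()):
            problems.append((s, p, o, "range"))
    return problems


problems = violations(DATA, SCHEMA)
```

With this data only the statement about `ex:doc9` is reported, as a domain violation, because that resource is not declared to be an instance of `mi:ColumnMiniature`.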

Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:

| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.' 13

This is a well-crafted listing of RDF benefits, from the provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com Definitions: Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining the classes (views) further, the collection instance data searched for are eventually found.
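The two ideas above, the repository as a union of RDF cards and filtering by selected classes, can be sketched as follows. This is not how Ontogator is implemented; the class names `fms:Textile` and `fms:Wool`, the instance identifiers and both function names are assumptions made purely for illustration.

```python
# Two hypothetical RDF cards, as harvested from two museums.
CARD_MUSEUM_A = [
    ("a:obj1", "rdf:type", "fms:Textile"),
    ("a:obj1", "rdf:type", "fms:Wool"),
]
CARD_MUSEUM_B = [
    ("b:obj7", "rdf:type", "fms:Textile"),
]


def build_repository(cards):
    """Combine harvested cards into one graph (a set union of triples)."""
    repository = set()
    for card in cards:
        repository |= set(card)
    return repository


def instances_of(repository, selected_classes):
    """Return the instances that belong to every selected class."""
    typed = {}
    for s, p, o in repository:
        if p == "rdf:type":
            typed.setdefault(s, set()).add(o)
    wanted = set(selected_classes)
    return {s for s, classes in typed.items() if wanted <= classes}


repository = build_repository([CARD_MUSEUM_A, CARD_MUSEUM_B])
broad = instances_of(repository, ["fms:Textile"])
narrow = instances_of(repository, ["fms:Textile", "fms:Wool"])
```

Selecting one class returns objects from both museums, while adding a second class narrows the result to the single object that satisfies both restrictions, which mirrors the progressive constraining of views.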

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.

From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. 14

This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge's introduction to multiagent systems. 15

Agent

'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent

'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action

'By flexible autonomous action, we mean reactive, proactive, social.'

| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'

| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'

| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)

| Learning: an agent will improve its performance over time


14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
The Finnish Museums on the Semantic Web: Overview of the System's Set-up
[Diagram, from top to bottom: User client (WWW browser: topic-based navigation, view-based filtering); Server (Ontogator software); RDF database (semantic graph: knowledge space of shared ontology and metadata); Web crawler; and, for each museum (collection database 1, 2, ... n): RDF instances, metadata editor, XML repository and collection database. Layers: RDF Schema (semantic interoperability), XML Schema (syntactic interoperability), relational schemas (DBMS).]


THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications, UMR 7503, CNRS, INRIA, Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
SPECTRUM-XML: A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it


DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)', which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5, from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7, from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


collections anew in a way that fits the ontology. That is just too much work.’

Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.

There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’

Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.

The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)

Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’

Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl


DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.

‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction, it is like North. You go North but you never arrive and say “here it is”. This is the Semantic Web,’ said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed finally that the cultural heritage sector needed the Semantic Web, and that a good deal of education and guidance would be required to make it appreciate that need.

Genesis – The Creation, Birds and Fishes

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge, world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.

The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’

1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

2 Ontology: n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.

Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit, in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

The Dutch had doubts too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’

3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it

4 Telos Corporation, http://www.telos.com, Ashburn, VA, US

5 CIDOC: International Committee for Documentation of the International Council of Museums, httpwwwwillpowerinfomybycoukcidocCIDOCe (ICOM-CIDOC). Forum for documentation interests of museums and related organisations, one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the Web standards HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations that have systems supporting the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
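To make the harvesting idea concrete: an OAI Protocol request is an ordinary HTTP request whose query string names a verb (such as ListRecords or GetRecord) and a metadata format. The following is a minimal sketch only, not a harvester. The repository address and record identifier are hypothetical, and `oai_request_url` is an illustrative helper rather than part of any OAI toolkit. The `oai_dc` prefix is the one OAI-PMH 2.0 uses for unqualified Dublin Core.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL: the verb plus its arguments as a query string."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Ask for all records, described in unqualified Dublin Core.
list_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Ask for a single record; the identifier is a made-up example.
get_url = oai_request_url(
    BASE_URL,
    "GetRecord",
    identifier="oai:example.org:item-1",
    metadataPrefix="oai_dc",
)

print(list_url)
print(get_url)
```

A service provider would fetch such URLs and validate the XML responses against the protocol's schema; that step is left out here because it depends on the repository in question.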

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy⁶, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk⁷. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audiovisual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
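Guarino's point about stipulating the meaning of “before” can be illustrated in a few lines of code. The sketch below is an assumption-laden illustration, not anything presented at the Forum: the `Period` class, the two function names and the year ranges are all invented for the example, and the dates are deliberately rough placeholders rather than historical claims. It shows two possible axioms for “before”, one excluding overlap and one permitting it, giving different answers for the same pair of periods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named interval of years (illustrative values only)."""
    name: str
    start: int
    end: int


def before_strict(a: Period, b: Period) -> bool:
    """Stipulation 1: a is before b only if a has ended when b starts (no overlap)."""
    return a.end < b.start


def before_allowing_overlap(a: Period, b: Period) -> bool:
    """Stipulation 2: a is before b if it starts earlier and ends earlier; overlap is allowed."""
    return a.start < b.start and a.end < b.end


# Placeholder year ranges, chosen only so that the two periods overlap.
other_period = Period("another period", 1300, 1450)
renaissance = Period("Renaissance", 1400, 1600)

print(before_strict(other_period, renaissance))            # False: the periods overlap
print(before_allowing_overlap(other_period, renaissance))  # True under the looser axiom
```

Two systems that both record “A before B” will disagree about overlapping periods unless they share one of these stipulations, which is the interoperability problem a foundational ontology is meant to remove.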

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

6 Office of the e-Envoy, http://www.e-envoy.gov.uk

7 Govtalk, http://www.govtalk.gov.uk

8 Sesame environment, httpwwwontoknowledgeorgtoolsfactsheetSesamehtml

9 Mereology: n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage, architecture, film, whatever, all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing - what to do about time, basic concepts, process, agents and so on - were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Standards Organisation (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things, and on the other hand they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
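The object/event distinction Guarino mentions can be sketched as a small data model. This is an illustration under stated assumptions only: the class names `HeritageObject` and `HeritageEvent`, the helper `events_involving` and every value in the sample data are invented for the example and are not taken from the CIDOC CRM or from any collection record.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HeritageObject:
    """Something that persists through time, such as a manuscript."""
    title: str


@dataclass
class HeritageEvent:
    """Something that happens at a time and place and involves objects."""
    label: str
    year: int
    place: str
    involved: List[HeritageObject] = field(default_factory=list)


def events_involving(obj: HeritageObject, events: List[HeritageEvent]) -> List[str]:
    """Return the labels of all events involving the object, ordered by year."""
    matching = [e for e in events if obj in e.involved]
    return [e.label for e in sorted(matching, key=lambda e: e.year)]


# Made-up sample data: dates and places belong to the events, not to the object.
psalter = HeritageObject("Example psalter")
events = [
    HeritageEvent("acquisition", 1900, "Example library", [psalter]),
    HeritageEvent("production", 1200, "Example scriptorium", [psalter]),
]

print(events_involving(psalter, events))
```

Keeping times and places on events rather than on the object means a new fact about the object's history becomes one more event, with no change to the object record itself.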

There were still one or two doubts but CULTOSproject co-ordinator Wernher Behrendt tidied upwith a daring stance lsquoLet me be a heretic for asecond How many of us have an operating systemother than Windows What I am saying is thatstandardisation often helps Even if it is not the beststandard it does help get people working togetherIt is perfectly fair to define a standard for the worldnowThere will be a lot of discussion but it will winddown to a few constructs It is a better method ofgetting an ontology accepted than having ontologiesmushrooming all over the place that must then beintegratedrsquo

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools. Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations, www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’, by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts, http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


Genesis – The Creation Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.


OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’; includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University: The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org

The Semantic Web Community Portal: SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies, based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory, the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of what a generic ontology has to address is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling: ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology, http://ontology.ip.rm.cnr.it


SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren


The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function, how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
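The rows-to-XML step described in this box can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the table names (item, origin), the view name item_view and the function names are hypothetical, an in-memory SQLite database stands in for a museum's collection database, and the element names follow the fictitious m-i.org example used in this primer.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # namespace of the fictitious m-i.org site


def build_demo_database() -> sqlite3.Connection:
    """Create an in-memory database with two tables and a view joining them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, subject TEXT);
        CREATE TABLE origin (item_id TEXT, creator TEXT, place TEXT, year TEXT);
        INSERT INTO item VALUES
            ('image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam''s body');
        INSERT INTO origin VALUES
            ('image5kb78d38i', 'Alexander Master', 'Utrecht', 'circa 1430');
        -- The view is a virtual table defined by an SQL query over both tables.
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.subject, o.creator, o.place, o.year
            FROM item AS i JOIN origin AS o ON o.item_id = i.item_id;
        """
    )
    return conn


def rows_to_xml(conn: sqlite3.Connection) -> dict:
    """Return {item_id: XML string}, one XML document per collection item."""
    ET.register_namespace("mi", MI_NS)
    cursor = conn.execute("SELECT * FROM item_view ORDER BY item_id")
    columns = [description[0] for description in cursor.description]
    documents = {}
    # Group the (sorted) view rows by item, so each item yields one document.
    for item_id, rows in groupby(cursor, key=lambda row: row[0]):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        seen = set()
        for row in rows:
            for name, value in zip(columns[1:], row[1:]):
                if (name, value) not in seen:  # emit each distinct value once
                    seen.add((name, value))
                    ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


if __name__ == "__main__":
    for xml_text in rows_to_xml(build_demo_database()).values():
        print(xml_text)
```

The grouping step matters when the join returns several rows for one item; in this small demo each item happens to produce a single row, so the grouping is exercised only trivially.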

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services: Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
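A minimal Python sketch may help to make this point concrete. The parser below hands back <creator> and <H1> as plain names; any "meaning" comes from a lookup table inside the processing program. The table MEANING, the sample document and the function name are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

DOCUMENT = (
    "<record>"
    "<creator>Alexander Master</creator>"
    "<H1>Alexander Master</H1>"
    "</record>"
)

# Hypothetical interpretation table: the meaning lives in the program,
# not in the tag names themselves.
MEANING = {
    "creator": "Created by: {}",
    "H1": "Heading: {}",
}


def interpret(xml_text: str) -> list:
    """Turn each child element into a sentence, if the program knows the tag."""
    lines = []
    for element in ET.fromstring(xml_text):
        if element.tag in MEANING:
            lines.append(MEANING[element.tag].format(element.text))
        else:
            # To the parser this tag is as valid as any other; only our
            # program lacks an interpretation for it.
            lines.append(f"(no interpretation for <{element.tag}>)")
    return lines


if __name__ == "__main__":
    for line in interpret(DOCUMENT):
        print(line)
```

Removing an entry from the table changes what the program "understands", while the XML document itself stays exactly the same.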

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration (strictly optional in XML 1.0, but recommended), which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a/10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.

Database rows (item data from database rows, grouped by item):
image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

Rows to XML process

XML document (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
    image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace, declared in the start tag of the root element, is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules include, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; and all attribute values must be within quotation marks (e.g. "image5kb78d38i").
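The syntax rules and the namespace mechanism explained above can be tried out with a standard XML parser. The following Python sketch is an illustration only: the function names and the deliberately broken variants are assumptions made for this example, and the sample document is a shortened version of the one in graphic 1. It shows that a parser rejects documents that break the rules, and that the prefix mi is merely a placeholder for the namespace name.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"

GOOD = (
    f'<mi:image xmlns:mi="{MI_NS}" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)
# Three deliberately broken variants, one per syntax rule discussed above.
CASE_MISMATCH = GOOD.replace("</mi:creator>", "</mi:Creator>")
MISSING_END_TAG = GOOD.replace("</mi:creator>", "")
UNQUOTED_ATTRIBUTE = GOOD.replace('"image5kb78d38i"', "image5kb78d38i")

# Same namespace name, different prefix: the parser treats it as the same name.
OTHER_PREFIX = (
    f'<img:image xmlns:img="{MI_NS}" image_id="image5kb78d38i">'
    "<img:creator>Alexander Master</img:creator>"
    "</img:image>"
)


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def expanded_root_name(xml_text: str) -> str:
    """Return the root element name as the parser sees it: '{namespace}local'."""
    return ET.fromstring(xml_text).tag


if __name__ == "__main__":
    for label, text in [
        ("good", GOOD),
        ("case mismatch", CASE_MISMATCH),
        ("missing end tag", MISSING_END_TAG),
        ("unquoted attribute", UNQUOTED_ATTRIBUTE),
    ]:
        print(label, is_well_formed(text))
    print(expanded_root_name(GOOD) == expanded_root_name(OTHER_PREFIX))
```

Note that well-formedness says nothing about whether the document uses the right elements; that is the task of the XML Schema discussed next.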

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3/16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5/13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
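To give a feeling for what validation against this schema checks, the Python sketch below re-implements the schema's constraints by hand: the namespace-qualified root element, the required image_id attribute and the ordered sequence of simple child elements. This is explicitly not real XML Schema validation, which would normally be done with a schema-aware validator; the function name check_image_document and the wording of the messages are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Order of the child elements as declared inside xs:sequence (lines 6-12).
EXPECTED_CHILDREN = [
    "type", "subject", "iconclass", "creator", "manuscript", "place", "year",
]

VALID = (
    f'<mi:image xmlns:mi="{MI_NS}" image_id="image5kb78d38i">'
    "<mi:type>Column Miniature</mi:type>"
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:iconclass>71A3421</mi:iconclass>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:manuscript>Historic Bible</mi:manuscript>"
    "<mi:place>Utrecht</mi:place>"
    "<mi:year>circa 1430</mi:year>"
    "</mi:image>"
)


def check_image_document(xml_text: str) -> list:
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{MI_NS}}}image":
        return [f"unexpected root element {root.tag}"]
    problems = []
    # Line (14) of the schema: image_id is a required attribute.
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    # Lines (5)-(13): the children must appear exactly in the declared order,
    # and (2d) requires them to be namespace qualified.
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    actual = [child.tag for child in root]
    if actual != expected:
        problems.append("child elements do not match the required sequence")
    # Lines (6)-(12): xs:string elements hold text only, no child elements.
    for child in root:
        if len(child):
            problems.append(f"{child.tag} must not contain child elements")
    return problems


if __name__ == "__main__":
    print(check_image_document(VALID))
```

A hand-written check like this has to be changed whenever the schema changes, which is one practical reason for validating against the schema itself in a real system.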

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
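The idea of one structured source feeding several outputs can be sketched as follows. The two Python functions stand in for style sheets (a real set-up would more likely use a style sheet language such as XSLT). The record is a simplified, namespace-free assumption, and the function names to_html and to_text are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Simplified record for illustration (namespace omitted for brevity).
RECORD = """<image image_id="image5kb78d38i">
  <subject>Eve emerges from Adam's body</subject>
  <creator>Alexander Master</creator>
</image>"""


def to_html(record):
    """Render the record as an HTML fragment for a Web page."""
    subject = escape(record.findtext("subject"))
    creator = escape(record.findtext("creator"))
    return f"<h1>{subject}</h1><p>Creator: {creator}</p>"


def to_text(record):
    """Render the same record as a plain-text caption."""
    return f"{record.findtext('subject')} ({record.findtext('creator')})"


record = ET.fromstring(RECORD)
print(to_html(record))
print(to_text(record))
```

The data are stored once; only the presentation logic differs per channel.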

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology

In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.

5 For more elaborate and formal descriptions see Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge, there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is: 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
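For a plain code such as 71A3421, the hierarchical path can be read off the notation itself: each broader concept is a prefix of the code. The sketch below assumes this prefix rule holds for simple codes without brackets or keys. The labels are those of the path listed with this example, and the function name hierarchical_path is hypothetical.

```python
# Labels as given in the hierarchical path for this concept.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(code):
    """Return every prefix of a plain Iconclass code, broadest first."""
    return [code[:length] for length in range(1, len(code) + 1)]


for step in hierarchical_path("71A3421"):
    print(step, LABELS.get(step, "(label not listed here)"))
```

This is what makes such a classification useful for retrieval: a search for 71A3 can also find everything catalogued under its narrower codes.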

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body

6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.

7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl

8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml

9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html

10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327

11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.

12 Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry and very academic.'
Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).

RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources;

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;

| statements: these associate a value for a named property with the resource.

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
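The two triples of Graphic 2 can also be written down as plain data, which makes the graph structure easy to inspect. The following Python sketch is only an illustration: the Triple type and the objects helper are hypothetical names, and joining namespace and class name with '#' is an assumption.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"   # '#' separator is an assumption


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


# The labelled directed graph is simply a set of triples.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow every arc labelled `predicate` that leaves the node `subject`."""
    return {t.object for t in graph
            if t.subject == subject and t.predicate == predicate}


print(objects(graph, MI + "Miniature", RDFS + "subClassOf"))
```

Note that the URI for Miniature occurs once as an object and once as a subject; this is how separate statements join up into one graph.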

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case, a tool developed by the project team called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.

For example, by applying the template rule
<image5kb78d38i/type mi:hasCreator image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator 'Alexander Master'>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
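How a template rule turns element values into a triple can be sketched with the limited XPath support in Python's standard library. This is not the FMS editor tool's implementation. The document below is hypothetical (only the creator value comes from the example above), and RULE and apply_rule are names invented for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.m-i.org/images"}

# Hypothetical XML document; the type value is an assumption.
DOC = """<image xmlns="http://www.m-i.org/images" image_id="image5kb78d38i">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>"""

# A rule is a triple template: an XPath for the subject, a fixed property,
# and an XPath for the object.
RULE = ("m:type", "mi:hasCreator", "m:creator")


def apply_rule(rule, root):
    """Instantiate the template, or return None if the rule does not match."""
    subject_path, prop, object_path = rule
    subject = root.findtext(subject_path, namespaces=NS)
    obj = root.findtext(object_path, namespaces=NS)
    if subject is None or obj is None:
        return None
    return (subject, prop, obj)


print(apply_rule(RULE, ET.fromstring(DOC)))
```

A rule whose XPath expressions find nothing simply produces no triple, which matches the "if the rule matches" condition described above.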

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
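The effect of transitivity can be made concrete with a few lines of code that derive the implied superclasses from the two stated subclass relations. This is a minimal sketch; the dictionary layout and the function name superclasses are choices made here for illustration.

```python
# Directly stated rdfs:subClassOf relations from the example.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, direct=SUBCLASS_OF):
    """Return all direct and implied superclasses of `cls`."""
    seen = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for parent in direct.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                todo.append(parent)
    return seen


print(sorted(superclasses("mi:ColumnMiniature")))
# prints ['mi:Image', 'mi:Miniature']
```

Only two relations were stated, yet mi:Image appears among the superclasses of mi:ColumnMiniature: this is the implicit statement mentioned above.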

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
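Read as constraints, the domain and range declarations above allow a statement to be checked before it is accepted. The sketch below mirrors those two declarations literally. The instances ex:img1 to ex:img3 and the function name violations are hypothetical, and subclass relations are deliberately ignored to keep the example short.

```python
# Hypothetical instances and the class each one belongs to.
TYPES = {
    "ex:img1": "mi:ColumnMiniature",
    "ex:img2": "mi:Image",
    "ex:img3": "mi:ColumnMiniature",
}
# Constraints as declared in the statements above.
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}


def violations(subject, prop, value):
    """List the domain and range constraints that a statement breaks."""
    found = []
    if prop in DOMAIN and TYPES.get(subject) != DOMAIN[prop]:
        found.append(f"{subject} is not an instance of {DOMAIN[prop]}")
    if prop in RANGE and TYPES.get(value) != RANGE[prop]:
        found.append(f"{value} is not an instance of {RANGE[prop]}")
    return found


print(violations("ex:img1", "mi:hasType", "ex:img3"))  # no violations
print(violations("ex:img2", "mi:hasType", "ex:img3"))  # domain violated
print(violations("ex:img1", "mi:hasType", "ex:img2"))  # range violated
```

A fuller check would also accept instances of subclasses of the declared class.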

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:

| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.'[13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com, searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
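Because every RDF card is a set of triples, combining harvested cards with the shared ontology amounts to a set union. The following sketch shows the idea with invented data: the cards and object identifiers are hypothetical, the primer's mi: classes stand in for the textiles ontology, and build_repository is not a function of the FMS system.

```python
# Hypothetical RDF cards harvested from two museums' public directories.
card_museum_a = {("a:obj1", "rdf:type", "mi:ColumnMiniature")}
card_museum_b = {("b:obj7", "rdf:type", "mi:Miniature")}

# The shared ontology, here reduced to two subclass statements.
ontology = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}


def build_repository(ontology, cards):
    """Combine the shared ontology and all harvested cards into one graph."""
    repository = set(ontology)
    for card in cards:
        repository |= card
    return repository


repository = build_repository(ontology, [card_museum_a, card_museum_b])
print(len(repository))
```

Since both cards refer to classes of the same ontology, their statements connect to one graph without any further conversion.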

How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
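The principle of view-based filtering can be illustrated in a few lines: selecting a class returns its instances and those of its subclasses, and selecting a narrower class shrinks the result. This sketch does not describe how Ontogator works internally. The instance data and the names is_a and filter_view are assumptions, again using the primer's mi: classes.

```python
SUBCLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}

# Hypothetical collection objects and their classes.
INSTANCES = {
    "a:obj1": "mi:ColumnMiniature",
    "b:obj7": "mi:Miniature",
    "b:obj9": "mi:Image",
}


def is_a(cls, target):
    """True if `cls` equals `target` or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_view(selected_class):
    """Return the instances that match the selected class restriction."""
    return sorted(name for name, cls in INSTANCES.items()
                  if is_a(cls, selected_class))


print(filter_view("mi:Image"))            # broadest view
print(filter_view("mi:Miniature"))        # narrower view
print(filter_view("mi:ColumnMiniature"))  # narrowest view
```

Each further constraint removes instances from the result, which is how the user homes in on the records searched for.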

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.[15]

Agent

'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent

'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action

'By flexible autonomous action we mean reactive, proactive, social.'

| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).'

| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'

| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language and perhaps cooperate with others.'

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

| Learning: an agent will improve its performance over time.


14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge, An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources

In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer

XML repository

OntologyRDFS

K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


The Finnish Museums on the Semantic Web: Overview of the System's Set-up

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

[Diagram. Its labels, ordered from data source to user: collection databases 1 to n (relational schemas, DBMS); per museum a metadata editor and an XML repository (XML Schema: syntactic interoperability); RDF instances (RDF Schema: semantic interoperability); Web crawler; RDF database holding the semantic graph (knowledge space of shared ontology and metadata); server with the Ontogator software; user client (WWW browser) with topic-based navigation and view-based filtering.]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992, she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999, Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at httpwwwcultuurwijzernl and customised access for the educational field at httpwwwcultuurwijsnl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (httpwwwedw-internationalcom), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meliedwit

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, httpwwwukolnacuk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: PMillerukolnacuk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: FrankNackcwinl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also: httpwwwgeogportacukhist-boundpeopleniccoluccihtm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: SRosshatiiartsglaacuk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996, Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01.02.01-30.07.03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See httpcwebinriafrProjectsMesmuses
E-mail: scottiangalileoimssfirenzeit


DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULTInfo.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at httpwwwdigicultinfo

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser
guntramgesersalzburgresearchat
Phone: +43-(0)662-2288-303

Mr John Pereira
johnpereirasalzburgresearchat
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
httpwwwsalzburgresearchat

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
httpwwwhatiiartsglaacuk
Contact: Mr Seamus Ross, srosshatiiartsglaacuk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DIGICULT PROJECT INFORMATION

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray (?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague
Used with permission


DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB

By Michael Steemson

Genesis - The Creation Birds and Fishes

It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.

‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction; it is like North. You go North, but you never arrive and say “here it is”. This is the Semantic Web,’ said another.

The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.

It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge, world-wide investment in time and effort creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.

The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on, ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’

1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, httpwwwsciamcom20010501issue0501berners-leehtml
2 Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.

Cultural Units of Learning - Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML/SMIL (with interactive extensions), MPEG-7 and RDF.
httpwwwcultosorg

The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, httpwwwdennl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, httpwwwcwinl) in Amsterdam, works with a multimedia and human-computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely, and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently, and describe it differently, because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)³. But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because, ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
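The ‘richer representation’ Dr Dallas describes, knowing that something is a person and that this person lived in a place, can be pictured as a handful of subject-predicate-object statements of the kind RDF uses. The sketch below is only an illustration in plain Python: the `ex:` identifiers and the `ex:livedIn` property are hypothetical names invented here, not terms from any ontology discussed at the Forum.

```python
# A minimal sketch of typed statements ("this is a person", "this person
# lived in a place"). All identifiers are hypothetical and for illustration only.
TRIPLES = [
    ("ex:Galileo", "rdf:type", "ex:Person"),
    ("ex:Florence", "rdf:type", "ex:Place"),
    ("ex:Galileo", "ex:livedIn", "ex:Florence"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def people_who_lived_in(place, triples=TRIPLES):
    """Return subjects typed as ex:Person that have an ex:livedIn link to `place`."""
    persons = {s for s, p, o in triples if p == "rdf:type" and o == "ex:Person"}
    return sorted(
        s for s, p, o in triples
        if p == "ex:livedIn" and o == place and s in persons
    )


if __name__ == "__main__":
    print(objects("ex:Galileo", "ex:livedIn"))
    print(people_who_lived_in("ex:Florence"))
```

Because the type of each resource is stated explicitly, a system can answer a question such as ‘which people lived in this place?’ rather than merely matching a text string.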

3 The museum and Web site are rich resources for the life and work of Galileo: httpgalileoimssfirenzeit
4 Telos Corporation, httpwwwteloscom, Ashburn, VA, US
5 CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), httpwwwwillpowerinfomybycoukcidocCIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (httpwwwnladlibsoftcom), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Forum Project
httpwwwoaforumorg
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see: S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe. An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, httpwwwdliborgdlibjanuary03dobratz01dobratzhtml
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at httpwwwoaforumorgworkshopslisb_programmephp

Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations that have systems supporting the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
httpwwwopenarchivesorg
The OAI-PMH version 2.0, released in June 2002, can be found at httpwwwopenarchivesorgOAI20openarchivesprotocolhtm
See also: John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, httpwwwarchimusecommw2001papersperkins
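To give a feel for what harvesting with the OAI Protocol looks like in practice, the sketch below builds the two kinds of request whose responses carry metadata, ListRecords and GetRecord, as ordinary HTTP query strings. The `verb`, `identifier` and `metadataPrefix` arguments and the `oai_dc` prefix for unqualified Dublin Core come from the OAI-PMH specification; the repository address and the record identifier are hypothetical placeholders, and no request is actually sent.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "http://example.org/oai"


def build_request(verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return BASE_URL + "?" + urlencode(query)


# Ask for all records, described in unqualified Dublin Core.
list_url = build_request("ListRecords", metadataPrefix="oai_dc")

# Ask for one record; the identifier below is an invented example.
get_url = build_request(
    "GetRecord",
    identifier="oai:example.org:item-1",
    metadataPrefix="oai_dc",
)

if __name__ == "__main__":
    print(list_url)
    print(get_url)
```

A service provider would fetch such URLs and validate the XML responses against the schemas the protocol defines; that step is left out here.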

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy⁶, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk⁷. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms - topological relations, mereological⁹ relations, dependence relations, these kinds of things - then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
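Mr Guarino’s point about stipulating the meaning of ‘before’ can be made concrete with two alternative definitions, one that excludes overlapping periods and one that allows them. The sketch below is an illustration only: the `Period` class, the two function names and the year ranges are assumptions made for this example and are not taken from any ontology mentioned at the Forum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A historical period as a closed interval of years (illustrative only)."""
    name: str
    start: int
    end: int


def before_strict(a: Period, b: Period) -> bool:
    """Stipulation 1: a is 'before' b only if a has ended when b starts (no overlap)."""
    return a.end < b.start


def before_allowing_overlap(a: Period, b: Period) -> bool:
    """Stipulation 2: a is 'before' b if it starts earlier and ends earlier; overlap is allowed."""
    return a.start < b.start and a.end < b.end


# Invented year ranges for two overlapping periods.
period_a = Period("Period A", 1300, 1600)
period_b = Period("Period B", 1520, 1650)

if __name__ == "__main__":
    print(before_strict(period_a, period_b))
    print(before_allowing_overlap(period_a, period_b))
```

The two definitions disagree on exactly the overlapping case he raises, which is why two systems that both use the word ‘before’ can only exchange data reliably once the stipulation is written down as an axiom.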

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at httpwwwontoknowledgeorgpubshtml
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds), John Wiley, December 2002.

6 Office of the e-Envoy, httpwwwe-envoygovuk
7 Govtalk, httpwwwgovtalkgovuk
8 Sesame environment, httpwwwontoknowledgeorgtoolsfactsheetSesamehtml
9 Mereology n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: httpcwebinriafrProjectsMesmuses
See also: M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, httpwwwcultivate-intorgissue9mesmuses

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert Dr Dallas described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’

The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Genesis – The Creation: Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X

Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

Annotation & Authoring

‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): a collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM

The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr

A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios

See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language

The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL

DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


OIL - Ontology Inference Layer

OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies

Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes a detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language

The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework

See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web

‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE – Simple HTML Ontology Extensions

SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web

The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo

SPECTRUM-XML DTD

SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services

Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web

Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML – eXtensible Markup Language

See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES

AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that a generic ontology has to define is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling: ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER

By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’ [1]. This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’ [2]. Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, and provides illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that, while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’ [3]

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents

In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
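The rows-to-XML step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the table, view and column names (item, item_field, item_view, position) are hypothetical, the database is an in-memory SQLite database, and the namespace is the one of the fictitious m-i.org site used in this primer. The sketch defines a view that joins two tables, reads the rows through the view ordered by item, and combines each item's rows into one XML document.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

# Namespace of the fictitious m-i.org example site used in this primer.
MI_NS = "http://www.m-i.org/images"
ET.register_namespace("mi", MI_NS)

# Illustrative source data: one database row per field of a collection item.
FIELDS = [
    ("image5kb78d38i", 1, "type", "Column Miniature"),
    ("image5kb78d38i", 2, "subject", "Eve emerges from Adam's body"),
    ("image5kb78d38i", 3, "iconclass", "71A3421"),
    ("image5kb78d38i", 4, "creator", "Alexander Master"),
    ("image5kb78d38i", 5, "manuscript", "Historic Bible"),
    ("image5kb78d38i", 6, "place", "Utrecht"),
    ("image5kb78d38i", 7, "year", "circa 1430"),
]


def build_database():
    """Create an in-memory database with two tables and a view joining them.

    All table and column names are hypothetical, not those of the FMS museums.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE item (item_id TEXT PRIMARY KEY)")
    con.execute(
        "CREATE TABLE item_field "
        "(item_id TEXT, position INTEGER, name TEXT, value TEXT)"
    )
    con.execute("INSERT INTO item VALUES ('image5kb78d38i')")
    con.executemany("INSERT INTO item_field VALUES (?, ?, ?, ?)", FIELDS)
    # The view is a virtual table defined by an SQL query over both tables.
    con.execute(
        """CREATE VIEW item_view AS
           SELECT i.item_id, f.position, f.name, f.value
           FROM item AS i JOIN item_field AS f ON f.item_id = i.item_id"""
    )
    return con


def rows_to_xml(con):
    """Group the view's rows by item and return {item_id: XML document}."""
    rows = con.execute(
        "SELECT item_id, name, value FROM item_view ORDER BY item_id, position"
    ).fetchall()
    documents = {}
    # groupby relies on the rows already being sorted by item_id.
    for item_id, group in groupby(rows, key=lambda row: row[0]):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for _item_id, name, value in group:
            ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


if __name__ == "__main__":
    for xml_text in rows_to_xml(build_database()).values():
        print(xml_text)
```

Ordering the query by item before grouping keeps all rows of one collection item together, so each item yields exactly one document.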

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents

A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case: Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case: Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf

ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the individual elements.
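A small Python sketch may make this point concrete. It is an illustration only: the LABELS mapping and the function names are assumptions introduced here, not part of XML or of the FMS system. The parser reports nothing but element names and text, so <creator> and <H1> look alike to it; any meaning comes from the processing program, here a lookup table that happens to know two of the names.

```python
import xml.etree.ElementTree as ET


def describe(xml_text):
    """Return what the XML parser itself knows: element names and their text."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]


# The 'meaning' lives in the processing program, not in the markup.
# This mapping is an illustrative assumption.
LABELS = {"creator": "Made by", "manuscript": "Found in manuscript"}


def render(xml_text):
    """Interpret only those elements the program has been told about."""
    lines = []
    for tag, text in describe(xml_text):
        label = LABELS.get(tag)
        if label is None:
            continue  # unknown name: nothing tells the program what it means
        lines.append(f"{label}: {text}")
    return lines


if __name__ == "__main__":
    doc = "<image><creator>Alexander Master</creator><H1>Alexander Master</H1></image>"
    print(describe(doc))
    print(render(doc))
```

For the parser, the two child elements differ only in their names; the program treats them differently solely because of the mapping it was given.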

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document must begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Graphic 1: Database rows to XML process

Database rows grouped by item (item data from database rows):

image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

Rows-to-XML process, resulting XML document (image5kb78d38i.xml):

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match end-tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
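The syntax rules listed above can be tried out with any XML parser. The following Python sketch, which uses only the standard library, is a hedged illustration rather than a complete conformance test: the sample documents are shortened variants of the example document, and the helper names is_well_formed and root_name are introduced here for illustration. Each faulty variant breaks exactly one of the rules (case of tags, quoted attribute values, declared namespace prefix, single root element).

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_bytes):
    """Return True if the parser accepts the document, False on a syntax error."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError:
        return False
    return True


def root_name(xml_bytes):
    """Return the root element's expanded name, '{namespace}localname'."""
    return ET.fromstring(xml_bytes).tag


# Shortened version of the example document (one child element only).
GOOD = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>'
    b'<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    b"<mi:creator>Alexander Master</mi:creator>"
    b"</mi:image>"
)

# Each variant violates one rule from the list above.
CASE_MISMATCH = GOOD.replace(b"<mi:creator>", b"<mi:Creator>")
UNQUOTED_ATTRIBUTE = GOOD.replace(b'"image5kb78d38i"', b"image5kb78d38i")
UNDECLARED_PREFIX = GOOD.replace(b' xmlns:mi="http://www.m-i.org/images"', b"")
TWO_ROOTS = GOOD + b"<second/>"

if __name__ == "__main__":
    print(root_name(GOOD))
    for name in ("GOOD", "CASE_MISMATCH", "UNQUOTED_ATTRIBUTE",
                 "UNDECLARED_PREFIX", "TWO_ROOTS"):
        print(name, is_well_formed(globals()[name]))
```

The parser replaces the prefix mi by the namespace name it stands for, which is why the root element is reported under its expanded name rather than as mi:image.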

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative's XML Schema when they create their XML documents and validate them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "type", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
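
To make concrete what the schema demands of a document, the following sketch re-states its main rules as hand-written checks. It is a simplified stand-in and not real XML Schema validation: Python's standard library does not include a schema validator, and the names check_image_document and build_document are hypothetical.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Child elements in the order given by the xs:sequence (lines 6-12 of the schema).
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image_document(xml_text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not a namespace-qualified image")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    expected_tags = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if [child.tag for child in root] != expected_tags:
        problems.append("child elements do not match the declared sequence")
    for child in root:
        if len(child) > 0:  # a simple type must not contain other elements
            problems.append(f"{child.tag} must contain text only")
    return problems


def build_document(names, image_id=None):
    """Helper for this example: build a small test document as a string."""
    attr = f' image_id="{image_id}"' if image_id else ""
    body = "".join(f"<mi:{n}>x</mi:{n}>" for n in names)
    return f'<mi:image xmlns:mi="{MI_NS}"{attr}>{body}</mi:image>'


valid = build_document(EXPECTED_CHILDREN, "image5kb78d38i")
invalid = build_document(list(reversed(EXPECTED_CHILDREN)))  # wrong order, no id

print(check_image_document(valid))
print(check_image_document(invalid))
```

In practice a schema-aware validator would read the rules from the schema file itself, so that the checks never drift away from the published XML Schema.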

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology

In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology or, rather, an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.

[5] For more elaborate and formal descriptions see Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. ordered classification systems in which terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge, there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is: 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
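
For a plain alphanumeric code such as 71A3421, the hierarchical path can be read off the code itself, because each level of the hierarchy adds one character. The sketch below relies on that observation only; the function name iconclass_path is hypothetical, and the sketch ignores any further notation features that the full Iconclass system may use.

```python
def iconclass_path(code):
    """List the codes on the path from the top division down to the given code."""
    return [code[:length] for length in range(1, len(code) + 1)]


path = iconclass_path("71A3421")
for level in path:
    print(level)
```

Looking up the textual correlate of each code on the path (e.g. 71 Old Testament) would still require the Iconclass system itself.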

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]

7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.

[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl

[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorgbusinessmsstempexhtml

[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html

[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327

[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.

[12] Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html

Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.'
Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, the statements carrying this information are built by using the Resource Description Framework (RDF).

RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
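
The triple model is simple enough to hold in an ordinary data structure. The following sketch stores the two subclass statements of our example as a set of triples. The Triple type and the objects helper are hypothetical names, and the '#' separator in the m-i.org namespace URI is an assumption made for this illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI of the property that labels the arc
    obj: str        # URI of another resource, or a literal value


MI = "http://www.m-i.org/schemas/images#"       # assumed form of our namespace
RDFS = "http://www.w3.org/2000/01/rdf-schema#"  # RDF Schema namespace

graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow all arcs with the given label that leave the given node."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


print(objects(graph, MI + "Miniature", RDFS + "subClassOf"))
```

Note that MI + "Miniature" appears once as an object and once as a subject; this is how separate statements join up into one graph.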

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.

Graphic 2: RDF Data Model

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case, a tool developed by the project team called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, 'Alexander Master'>
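
How such a template is instantiated can be sketched with the limited XPath support of Python's standard library. The function apply_rule and the shortened document are illustrations only; the actual rule syntax and implementation of the FMS editor tool are not described here. Because the sketch takes the values straight from the XML document, the subject value is the element text 'Column Miniature'.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

XML_TEXT = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:type>Column Miniature</mi:type>"
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)


def apply_rule(document, subject_path, predicate, object_path):
    """Instantiate one template triple; return None if either path has no match."""
    subject = document.findtext(subject_path, namespaces=NS)
    obj = document.findtext(object_path, namespaces=NS)
    if subject is None or obj is None:
        return None  # the rule does not match this document
    return (subject, predicate, obj)


document = ET.fromstring(XML_TEXT)
triple = apply_rule(document, "mi:type", "mi:hasCreator", "mi:creator")
no_match = apply_rule(document, "mi:type", "mi:hasPlace", "mi:place")
print(triple)
print(no_match)
```

A rule whose paths find nothing simply produces no triple, which mirrors the statement above that a template only evaluates if the rule matches.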

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby, they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1, we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection, we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
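
The transitivity of rdfs:subClassOf means that software can derive the implicit superclasses by following the stated links step by step. The following is a minimal sketch of that derivation; the function name superclasses and the dictionary layout are choices made for this example only.

```python
def superclasses(cls, sub_class_of):
    """Return every stated and implied superclass of cls."""
    found = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in found:  # also guards against cycles
                found.add(parent)
                pending.append(parent)
    return found


# The two rdfs:subClassOf statements from the text, keyed by subclass.
sub_class_of = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}

print(superclasses("mi:ColumnMiniature", sub_class_of))
```

Only two statements were made explicitly; that mi:ColumnMiniature is a kind of mi:Image is found by the traversal.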

Defining properties

In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
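
Read as constraints in the way described above, domain and range can be checked mechanically for each statement. The sketch below shows one possible check. The function name violations, the record image0000 and the class mi:DecoratedInitial are hypothetical and serve only to produce a failing case.

```python
def violations(statement, instance_types, constraints):
    """List the domain and range constraints that one statement breaks."""
    subject, prop, value = statement
    rule = constraints.get(prop, {})
    problems = []
    if "domain" in rule and rule["domain"] not in instance_types.get(subject, set()):
        problems.append(f"{subject} is not an instance of {rule['domain']}")
    if "range" in rule and rule["range"] not in instance_types.get(value, set()):
        problems.append(f"{value} is not an instance of {rule['range']}")
    return problems


# mi:hasType rdfs:domain mi:ColumnMiniature, as in the statements above.
constraints = {"mi:hasType": {"domain": "mi:ColumnMiniature"}}

# Classes of which each record is an instance (rdf:type statements).
instance_types = {
    "image5kb78d38i": {"mi:ColumnMiniature"},
    "image0000": {"mi:DecoratedInitial"},  # hypothetical second record
}

ok = violations(("image5kb78d38i", "mi:hasType", "Column Miniature"),
                instance_types, constraints)
bad = violations(("image0000", "mi:hasType", "Column Miniature"),
                 instance_types, constraints)
print(ok)
print(bad)
```

A validating editor could run a check of this kind each time a cataloguer saves a set of statements.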

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce
| The standard syntax and query capability will allow applications to exchange information more easily
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering
| Intelligent software agents will have more precise data to work with'[13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com, Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
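
The idea of combining harvested RDF cards and then filtering by selected classes can be sketched as follows. This is not the Ontogator implementation, which is not described here; the function names, the record identifiers and the class mi:MadeInUtrecht are all hypothetical.

```python
RDF_TYPE = "rdf:type"


def build_repository(cards):
    """Combine the harvested RDF cards (sets of triples) into one set."""
    repository = set()
    for card in cards:
        repository |= card
    return repository


def instances_in_view(repository, selected_classes):
    """Return the subjects that are instances of every selected class."""
    matches = None
    for cls in selected_classes:
        members = {s for (s, p, o) in repository if p == RDF_TYPE and o == cls}
        matches = members if matches is None else matches & members
    return matches or set()


# Hypothetical cards from two museums.
museum_a = {
    ("a1", RDF_TYPE, "mi:ColumnMiniature"),
    ("a1", RDF_TYPE, "mi:MadeInUtrecht"),
    ("a2", RDF_TYPE, "mi:ColumnMiniature"),
}
museum_b = {("b1", RDF_TYPE, "mi:MadeInUtrecht")}

repository = build_repository([museum_a, museum_b])
broad = instances_in_view(repository, ["mi:ColumnMiniature"])
narrow = instances_in_view(repository, ["mi:ColumnMiniature", "mi:MadeInUtrecht"])
print(sorted(broad))
print(sorted(narrow))
```

Each additional class in the selection can only shrink the result, which is the sense in which constraining the views further leads to the records searched for.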

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.[15]

Agent

'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent

'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action

'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)
| Learning: an agent will improve its performance over time

[14] T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

[15] M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources

In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer

K. S. Candan, H. Liu, R. Suvarna, Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin, RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik, Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf

The Finnish Museums on the Semantic Web: Overview of the System's Set-up

| User client: WWW browser (topic-based navigation, view-based filtering)
| Server: Ontogator software and RDF database
| Semantic graph: knowledge space of shared ontology (RDFS) and metadata
| Web crawler: harvests the RDF instances of each museum
| Per museum (collection database 1, 2, ... n): relational schemas (DBMS), metadata editor, XML repository (XML Schema: syntactic interoperability), RDF instances (RDF Schema: semantic interoperability)

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

Wernher Behrendt Salzburg Research Austria Wernher Behrendt is a Senior Researcher at theSun Technology and Research Excellence Center atSalzburg Research (Austria) working on multimediamiddleware and interoperation issues He holds anMSc in Cognitive Science from Manchester Uni-versity and has more than 10 yearsrsquo experience innear-to-market IT research From 1989 to 1995 hewas a Senior Research Associate in the InformaticsDepartment at Rutherford Appleton Laboratory(UK) working on embedded knowledge basedsystems in distributed multimedia presentationsystems From 1995 to 1998 he was a SeniorResearch Associate at Cardiff University (UK)working on interoperation between heterogeneousinformation systems Mr Behrendt has held coursesin Computer Science and has worked on projectsranging from software engineering methods andquality assurance to legacy system reengineeringusing migration methods and distributed systemsmiddlewareE-mail wernherbehrendtsalzburgresearchat

Paolo BuonoraArchivio di Stato di Roma ItalyPaolo Buonora holds a degree in Philosophy fromthe University of Rome lsquoLa Sapienzarsquo (1976) Heworked in the Italian State Archive Administrationfrom 1978 where he was first involved in editing theGuida Generale degli Archivi di Stato italiani From1986 he worked in the Soprintendenza archivisticaper il Lazio surveying audiovisual archives muni-cipal archives from 1989 to 1991 at the PerugiaUniversity engaged in a doctoral research in lsquoUrbanand rural historyrsquo and from 1991 to 1994 again inthe Soprintendenza archivistica per il LazioAfter1994 he worked in the Archivio di Stato di Romawhere he was responsible for the photograph ser-vice and several working groups on informaticsapplication in archival documentation From 1997until the present time he has planned and directedthe Imago II project in the Archivio di Stato diRomaSee httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara University of Nancy 2and LORIA FranceSamuel Cruz-Lara obtained a Masterrsquos degree inComputer Science in 1984 (University HenriPoincareacute Nancy 1) and a PhD degree in ComputerScience in 1988 (National Polytechnic Institute ofLorraine)The central topic of his PhD thesis was thegeneration of integrated development environmentsby using attribute grammarsHe is currently Associate Professor at the Universityof Nancy 2 (Institute of Technology ComputerScience Department) and permanent Researcher atLORIA (Lorraine Laboratory for Research in Com-puter Science and its Applications ndash UMR 7503 ndashCNRS ndash INRIA ndash Universities of Nancy) He is amember of the lsquoLanguage and Dialoguersquo team and hasconducted several research activities on distributedsoftware architectures and textual linguistic resourcesmanagement He is currently working in the contextof distributed architectures and multimedia resourcesmanagement Dr Cruz-Lara has participated in severalprojects in particular CNRSSILFIDE and MLIS-ELAN and he is at present co-leader of the lsquoDigitalMuseumrsquo projectThis is a joint project betweenLORIA and the National Chi-Nan UniversityThelsquoDigital Museumrsquo project is sponsored by theNational Science Council of the Republic of China(Taiwan) and supported by INRIA (France)

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

THE DARMSTADT FORUM

PARTICIPANTS

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: NicolaGuarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: PMiller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: FrankNack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: SRoss@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it


DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntramgeser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, johnpereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, sross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray (?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.

The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn, to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.

In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.

The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data, to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
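As a rough illustration of the kind of machine-readable statements such ontologies and RDF data would hold, the sketch below models RDF-style subject-predicate-object triples as plain Python tuples and lets an ‘agent’ query them by pattern. All ‘ex:’ names are invented for this example and belong to no published vocabulary; a real system would more likely rely on an RDF library and shared ontologies.

```python
# Hypothetical RDF-style triples: (subject, predicate, object).
# Every "ex:" identifier is made up for this sketch.
TRIPLES = [
    ("ex:Galileo", "rdf:type", "ex:Person"),
    ("ex:Galileo", "ex:bornIn", "ex:Pisa"),
    ("ex:Pisa", "rdf:type", "ex:Place"),
    ("ex:Sidereus_Nuncius", "ex:author", "ex:Galileo"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (ts, tp, to)
        for (ts, tp, to) in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


def works_by(triples, person):
    """List the subjects that name `person` as their author."""
    return sorted(s for (s, _, _) in match(triples, p="ex:author", o=person))
```

Calling `works_by(TRIPLES, "ex:Galileo")` asks the question ‘which resources name Galileo as author?’ without any knowledge of how a particular Web page was laid out, which is the point of separating statements from presentation.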

The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’

THE DAZZLING PROSPECTS

Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.

Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’

Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years, and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’

Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’

LANGUAGE REPRESENTATION HITCHES

A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’

Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit, in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML/SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology, n. 1. Philosophy: The branch of metaphysics that deals with the nature of being. 2. Logic: The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow, 1991.

The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human-computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely, and these shifts were invisible to a system. ‘Humans can look at material one day, and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)³. But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating, or development, of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because, ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance, if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’

3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US
5 CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), httpwwwwillpowerinfomybycoukcidocCIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information, and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
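To make the harvesting idea behind the OAI Protocol for Metadata Harvesting more concrete, here is a minimal Python sketch of what a service provider might do: build a ListRecords request for unqualified Dublin Core and read titles out of the XML response. The endpoint `http://example.org/oai` and the sample record are invented for illustration; the `verb` and `metadataPrefix` parameters and the XML namespaces follow OAI-PMH 2.0, but a real harvester would also need to handle errors, sets and resumption tokens, which are left out here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "http://example.org/oai"  # hypothetical repository endpoint

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response, trimmed to the parts used below.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:psalter-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Psalter</dc:title>
          <dc:date>c. 1180</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for the given metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def titles_by_identifier(xml_text):
    """Map each record identifier to its Dublin Core title."""
    root = ET.fromstring(xml_text)
    result = {}
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        result[identifier] = record.findtext(".//dc:title", namespaces=NS)
    return result
```

In this sketch the request is only constructed, not sent; fetching `list_records_url(BASE_URL)` over HTTP and passing the body to `titles_by_identifier` would complete the round trip against a repository that actually supports the protocol.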

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy⁶, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk⁷. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies but vocabularies that are formally defined in minimal terms.’
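The point about stipulating “before” can be sketched in a few lines of Python. The two functions below are alternative, hypothetical readings of the relation between two time intervals: one excludes overlap (the reading usually called “before” in Allen’s interval algebra), the other allows it. The names and the dates are illustrative assumptions only; they are not taken from any foundational ontology or from scholarly period boundaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A time interval in years; start must not exceed end."""
    name: str
    start: int
    end: int

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("start must not exceed end")


def before_strict(a, b):
    """Stipulation 1: a ends before b begins, so overlap is excluded."""
    return a.end < b.start


def before_lenient(a, b):
    """Stipulation 2: a merely begins before b begins, so overlap is allowed."""
    return a.start < b.start


def overlaps(a, b):
    """True when the two intervals share at least one year."""
    return a.start <= b.end and b.start <= a.end


# Placeholder dates chosen only so that the two periods overlap.
renaissance = Interval("Renaissance", 1400, 1600)
other_period = Interval("other period", 1580, 1750)
```

With these placeholder dates the two periods overlap, so `before_strict(renaissance, other_period)` is false while `before_lenient(renaissance, other_period)` is true. Two catalogues that both record ‘Renaissance before other period’ would therefore disagree unless they share the same stipulation, which is the interoperability problem the axioms are meant to remove.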

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at httpwwwontoknowledgeorgpubshtml
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: httpwwwontoknowledgeorg toolsfactsheetSesamehtml
9 Mereology: n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy. Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also: M. Meli, Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage (architecture, film, whatever), all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain, art history for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things, and on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was, “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group, which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

[Illustration: Genesis – The Creation Stars and Fishes]

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools. Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in: ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’, by Nick Crofts, Martin Doerr and Tony Gill, in: Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html

OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done so that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html

Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies, based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that a generic ontology has to define is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling: ‘Students learn all about Java, HTML and C++ (and name all the other languages), and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

Developing well-founded generic ontologies seems an enormous job, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
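The view-to-XML step can be sketched as follows. This is a minimal illustration under our own assumptions, not the FMS implementation: the table names items and subjects, the view name item_view, the element names and the sample rows are all invented for the example, and an in-memory SQLite database stands in for a museum’s collection database.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby


def demo_connection():
    """Build a tiny in-memory database with a view joining two tables.

    Schema and rows are hypothetical and exist only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (item_id TEXT PRIMARY KEY, type TEXT, creator TEXT);
        CREATE TABLE subjects (item_id TEXT, subject TEXT);
        INSERT INTO items VALUES ('item-1', 'Column Miniature', 'Unknown');
        INSERT INTO items VALUES ('item-2', 'Textile', 'Unknown');
        INSERT INTO subjects VALUES ('item-1', 'subject A');
        INSERT INTO subjects VALUES ('item-1', 'subject B');
        INSERT INTO subjects VALUES ('item-2', 'subject C');
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.creator, s.subject
            FROM items i JOIN subjects s ON s.item_id = i.item_id;
    """)
    return conn


def build_documents(conn):
    """Return {item_id: XML string}, one XML document per collection item."""
    rows = conn.execute(
        "SELECT item_id, type, creator, subject "
        "FROM item_view ORDER BY item_id, subject"
    )
    documents = {}
    # The view yields several rows per item; group them by item_id.
    for item_id, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        root = ET.Element("image", {"image_id": item_id})
        ET.SubElement(root, "type").text = group[0][1]
        ET.SubElement(root, "creator").text = group[0][2]
        for row in group:  # one <subject> element per joined row
            ET.SubElement(root, "subject").text = row[3]
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


if __name__ == "__main__":
    for xml_text in build_documents(demo_connection()).values():
        print(xml_text)
```

The grouping relies on the query being ordered by item_id, since groupby only merges adjacent rows. A real set-up would additionally validate each document against the agreed XML Schema.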

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.

FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions, see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf

ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year

XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
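A small sketch may help to show where the meaning actually comes from. It assumes a made-up record and a made-up rule table (DISPLAY_LABELS); neither is part of XML or of any standard. The parser hands back tag names as plain strings, and only the rule table written by the application programmer turns ‘creator’ into something the program acts upon.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the same text wrapped in two different tags.
RECORD = (
    "<record>"
    "<creator>Alexander Master</creator>"
    "<H1>Alexander Master</H1>"
    "</record>"
)


def list_elements(xml_text):
    """Return (tag, text) pairs exactly as the XML parser reports them."""
    return [(element.tag, element.text) for element in ET.fromstring(xml_text)]


# Hypothetical application rule: this program, not XML, decides
# that the tag "creator" deserves special treatment.
DISPLAY_LABELS = {"creator": "Created by"}


def describe(xml_text):
    """Apply the application's own rules to the parsed elements."""
    lines = []
    for tag, text in list_elements(xml_text):
        label = DISPLAY_LABELS.get(tag)
        if label is not None:
            lines.append(f"{label}: {text}")
    return lines


print(list_elements(RECORD))
print(describe(RECORD))
```

To the parser both elements are just a name and a string; remove the entry from DISPLAY_LABELS and the program no longer ‘knows’ anything about creators.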

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document must begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2/10 a) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Database rows (item data from database rows, grouped by item):

image5kb78d38i, 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process

XML document (image5kb78d38i.xml):

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
    image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process
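The rows-to-XML step shown in graphic 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name row_to_xml and the COLUMNS list are invented here, the row is taken to arrive as a simple tuple with the item identifier first, and the FMS initiative's actual conversion tooling is not described in the source.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # namespace URI of the example document
# Column order of the example row after the item identifier (assumed).
COLUMNS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def row_to_xml(row):
    """Turn one database row (item id first, then the COLUMNS values) into XML text."""
    ET.register_namespace("mi", MI_NS)  # serialise with the mi: prefix
    item_id, *values = row
    root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
    for name, value in zip(COLUMNS, values):
        child = ET.SubElement(root, f"{{{MI_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


row = (
    "image5kb78d38i", "Column Miniature", "Eve emerges from Adam's body",
    "71A3421", "Alexander Master", "Historic Bible", "Utrecht", "circa 1430",
)
xml_text = row_to_xml(row)
print(xml_text)
```

The sketch omits the XML declaration and line breaks; it only shows that each column value becomes the text of one namespaced child element of the root.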

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names for different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). The namespace we declare in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
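These syntax rules can be tried out with any XML parser. The following minimal sketch uses the parser in Python's standard library; the helper name is_well_formed and the shortened sample documents are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


NS = 'xmlns:mi="http://www.m-i.org/images"'

# Follows the rules: one root element, matching tags, quoted attribute value.
good = f'<mi:image {NS} image_id="image5kb78d38i"><mi:creator>Alexander Master</mi:creator></mi:image>'
# Start tag and end tag differ in case.
case_mismatch = f"<mi:image {NS}><mi:Creator>Alexander Master</mi:creator></mi:image>"
# Attribute value without quotation marks.
unquoted = f"<mi:image {NS} image_id=image5kb78d38i></mi:image>"
# Child element without a closing tag.
unclosed = f"<mi:image {NS}><mi:creator>Alexander Master</mi:image>"

for name, sample in [("good", good), ("case_mismatch", case_mismatch),
                     ("unquoted", unquoted), ("unclosed", unclosed)]:
    print(name, is_well_formed(sample))
```

Only the first sample passes; each of the others breaks exactly one of the rules listed above.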

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative's XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations:
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "type", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
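To make concrete what validation against this schema checks, the sketch below re-implements the schema's main constraints by hand in Python: the root element, the required image_id attribute, and the ordered sequence of text-only child elements. It is a hand-rolled stand-in written for illustration, not an XML Schema validator; a real project would use a schema-aware processor. The function name check_image_document is an assumption.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Ordered child elements declared inside xs:sequence (schema lines 6-12).
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def check_image_document(xml_text):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not image in the m-i.org namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != wanted:
        problems.append("child elements differ from the declared sequence")
    if any(len(child) for child in root):
        problems.append("a child element contains sub-elements, expected text only")
    return problems


children = "".join(f"<mi:{n}>x</mi:{n}>" for n in EXPECTED_CHILDREN)
valid_doc = f'<mi:image xmlns:mi="{MI_NS}" image_id="image5kb78d38i">{children}</mi:image>'

reversed_children = "".join(f"<mi:{n}>x</mi:{n}>" for n in reversed(EXPECTED_CHILDREN))
invalid_doc = f'<mi:image xmlns:mi="{MI_NS}">{reversed_children}</mi:image>'

print(check_image_document(valid_doc))
print(check_image_document(invalid_doc))
```

The second document is well-formed XML but fails two checks: the attribute is missing and the child elements are out of order. This is the difference between well-formedness and validity.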

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
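The idea of multi-channel publishing can be illustrated with a small Python sketch in which two functions play the role of two style sheets applied to the same XML record. This is a simplification under stated assumptions: real deployments would typically use a style sheet language such as XSLT, and the function names to_html and to_text as well as the shortened record are invented here.

```python
import xml.etree.ElementTree as ET
from html import escape

MI = "{http://www.m-i.org/images}"
SOURCE = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)


def to_html(record):
    """Channel 1: a fragment for a Web page."""
    subject = escape(record.findtext(MI + "subject"))
    creator = escape(record.findtext(MI + "creator"))
    return f"<h1>{subject}</h1><p>Creator: {creator}</p>"


def to_text(record):
    """Channel 2: a one-line caption, e.g. for a small display."""
    return f"{record.findtext(MI + 'subject')} ({record.findtext(MI + 'creator')})"


record = ET.fromstring(SOURCE)
print(to_html(record))
print(to_text(record))
```

The data are stored once; only the presentation logic differs per channel.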

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see: Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
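In the hierarchy above, each level's code is a prefix of the next, so the path for 71A3421 can be derived from the code itself. The Python sketch below shows this for the example; it assumes one character per level, which holds for this code but is not claimed to cover every form of Iconclass notation. The names LABELS and hierarchical_path are invented for illustration.

```python
# Labels copied from the hierarchy listed above.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(code):
    """List every prefix of the code, broadest first (one character per level assumed)."""
    return [code[:length] for length in range(1, len(code) + 1)]


for step in hierarchical_path("71A3421"):
    print(step, LABELS[step])
```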

6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html

32 DigiCULT

Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.'
Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.

Graphic 2: RDF Data Model
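The two statements of graphic 2 can be written down as plain data. The following Python sketch models a triple as a named tuple and a graph as a set of triples; it does not use an RDF library, and the '#' separator in the m-i.org URIs, the Triple type and the objects helper are assumptions made for illustration.

```python
from typing import NamedTuple

MI = "http://www.m-i.org/schemas/images#"    # example namespace; '#' separator assumed
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


# The two statements drawn in graphic 2.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow all arcs labelled `predicate` that leave the node `subject`."""
    return {t.object for t in graph if t.subject == subject and t.predicate == predicate}


print(objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf"))
```

Note that the Miniature URI appears once as an object and once as a subject: the same node takes part in both statements, which is what makes the set of triples a graph.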

Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, 'Alexander Master'>
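How such a template rule is instantiated can be sketched with the limited XPath support in Python's standard library. The rule format used here (a tuple of subject path, fixed predicate and object path) and the function name apply_rule are assumptions for illustration; the actual rule syntax of the FMS editor tool is not reproduced in the source.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}
DOC = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:type>Column Miniature</mi:type>"
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)

# Hypothetical rule: (path for the subject, fixed predicate, path for the object).
RULE = ("mi:type", "mi:hasCreator", "mi:creator")


def apply_rule(root, rule):
    """Instantiate the template; return no triples if a path finds nothing."""
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path, namespaces=NS)
    obj = root.findtext(object_path, namespaces=NS)
    if subject is None or obj is None:
        return []  # the rule does not match this document
    return [(subject, predicate, obj)]


root = ET.fromstring(DOC)
print(apply_rule(root, RULE))
```

With this shortened sample document, the path expressions are replaced by the text of the type and creator elements.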

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
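The transitivity of rdfs:subClassOf means that implicit superclasses can be computed from the stated ones. A minimal Python sketch, using the two subclass statements above as plain data; the names SUBCLASS_OF and superclasses are assumptions for illustration.

```python
# Direct rdfs:subClassOf statements from the text, as class -> set of direct superclasses.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return every direct and inherited superclass of cls."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(superclasses("mi:ColumnMiniature")))
```

Although only two statements are stored, the result for mi:ColumnMiniature contains both mi:Miniature and mi:Image.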

Defining properties

In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
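Read as constraints, as in the text above, domain and range declarations allow a validator to flag statements that use a property with the wrong kind of resource. The Python sketch below illustrates that reading with the mi:hasType declarations; the instance names prefixed ex: are invented, the class hierarchy is simplified to one superclass per class, and the code is not the FMS validator. Later versions of the RDF Schema specification treat domain and range as a basis for inferring types rather than for rejecting statements.

```python
# Class hierarchy and declarations taken from the examples in the text.
SUBCLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical instances and their classes, invented for illustration only.
TYPE_OF = {
    "ex:miniature1": "mi:ColumnMiniature",
    "ex:miniature2": "mi:ColumnMiniature",
    "ex:drawing1": "mi:Image",
}


def is_a(cls, target):
    """True if cls equals target or is a direct or inherited subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def violations(statement, type_of):
    """Check one (subject, property, value) statement against domain and range."""
    subject, prop, value = statement
    problems = []
    if prop in DOMAIN and not is_a(type_of.get(subject), DOMAIN[prop]):
        problems.append(f"{subject} is outside the domain of {prop}")
    if prop in RANGE and not is_a(type_of.get(value), RANGE[prop]):
        problems.append(f"{value} is outside the range of {prop}")
    return problems


print(violations(("ex:miniature1", "mi:hasType", "ex:miniature2"), TYPE_OF))
print(violations(("ex:drawing1", "mi:hasType", "ex:miniature2"), TYPE_OF))
```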

Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'[13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com, Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
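The two ideas in this section, combining RDF cards into one repository and filtering instances by a selected class, can be sketched in Python as follows. The cards, item names and function names are invented for illustration, the class hierarchy is the small one from the RDF Schema examples, and nothing here reflects how Ontogator is actually implemented.

```python
SUBCLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}

# Hypothetical RDF cards from two museums: sets of (subject, predicate, object).
card_museum_a = {
    ("a:item1", "rdf:type", "mi:ColumnMiniature"),
    ("a:item1", "mi:hasCreator", "Alexander Master"),
}
card_museum_b = {
    ("b:item7", "rdf:type", "mi:Image"),
    ("b:item9", "rdf:type", "mi:Miniature"),
}


def build_repository(cards):
    """The knowledge base is the union of all harvested cards."""
    repository = set()
    for card in cards:
        repository |= card
    return repository


def in_view(cls, view):
    """True if cls is the selected class or one of its subclasses."""
    while cls is not None:
        if cls == view:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_by_view(repository, view):
    """Return the instances whose class falls under the selected view."""
    return sorted(s for s, p, o in repository if p == "rdf:type" and in_view(o, view))


repository = build_repository([card_museum_a, card_museum_b])
print(filter_by_view(repository, "mi:Image"))
print(filter_by_view(repository, "mi:Miniature"))
```

Selecting the broad class mi:Image returns all three items; narrowing the view to mi:Miniature drops the item that is only known to be an image.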

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.

From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.[15]

Agent

'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent

'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action

'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)
| Learning: an agent will improve its performance over time


14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge, An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer

K. S. Candan, H. Liu, R. Suvarna, Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin, RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik, Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


The Finnish Museums on the Semantic Web: Overview of the System's Set-up

[Diagram. Labels, from top to bottom: User client (WWW browser; topic-based navigation, view-based filtering); Server (Ontogator software); RDF database (semantic graph: knowledge space of shared ontology and metadata); Web crawler; RDF instances; Metadata Editor; XML repository; Collection databases 1, 2, ... n. Layers: RDF Schema (semantic interoperability); XML Schema (syntactic interoperability); Relational schemas (DBMS).]

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT, please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th 2002 in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd 2002 in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


Cultural Units of Learning - Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML/SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org

ontology themselves should then use it to combine multimedia assets with each other.’

The Dutch had doubts too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’

Her countryman Dr Frank Nack, from the national research institute for mathematics and computer science CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human-computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’

The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work and now I want to be entertained. Which means I would like to access the information differently.’

He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that too.’

THE GALILEO CONUNDRUM

The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.

Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’

Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.

Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’

Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
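Dr Dallas’s point about a richer representation (‘this is a person and this person lived in a place’) can be illustrated with a minimal sketch in Python. It stores statements as subject-predicate-object triples, the general shape used by RDF, in a plain set. The `ex:` identifiers, the `TRIPLES` data and the helper function are hypothetical names chosen for this illustration; they are not taken from the Florence project or from any CIDOC model.

```python
# Statements as (subject, predicate, object) triples.
# All "ex:" names are invented for this illustration.
TRIPLES = {
    ("ex:galileo", "rdf:type", "ex:Person"),
    ("ex:galileo", "ex:livedIn", "ex:florence"),
    ("ex:florence", "rdf:type", "ex:Place"),
    ("ex:dialogo", "rdf:type", "ex:Work"),
    ("ex:dialogo", "ex:createdBy", "ex:galileo"),
}


def persons_who_lived_in(place):
    """Return identifiers typed as ex:Person that have an ex:livedIn link to place."""
    return sorted(
        subject
        for (subject, predicate, obj) in TRIPLES
        if predicate == "ex:livedIn"
        and obj == place
        and (subject, "rdf:type", "ex:Person") in TRIPLES
    )


if __name__ == "__main__":
    print(persons_who_lived_in("ex:florence"))
```

Because the type of each resource is stated explicitly, a query can ask for persons linked to a place rather than for a text string that happens to match, which is the kind of richer experience the quote describes.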

3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it

4 Telos Corporation, http://www.telos.com, Ashburn, VA, US

5 CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), httpwwwwillpowerinfomybycoukcidocCIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM)

CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from a NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project
http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
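As a hedged illustration of the OAI Protocol for Metadata Harvesting described in the box above, the following Python sketch builds the HTTP GET URLs for the ListRecords and GetRecord requests and extracts unqualified Dublin Core titles from a response. The repository address `repository.example.org`, the record identifier and the sample response are invented for this example, and the response is heavily abbreviated (a real one carries further required elements such as a response date and the echoed request). Only the verb names, the `oai_dc` metadata prefix and the XML namespaces come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint; not a real service.
BASE_URL = "http://repository.example.org/oai"

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_request(verb, **arguments):
    """Return the GET URL for an OAI-PMH request such as ListRecords or GetRecord."""
    params = {"verb": verb}
    params.update(arguments)
    return BASE_URL + "?" + urlencode(params)


# Invented, abbreviated ListRecords response used instead of a network call.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:psalter-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Psalter</dc:title>
          <dc:date>c. 1180</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_titles(xml_text):
    """Return (identifier, title) pairs for every record in a response."""
    root = ET.fromstring(xml_text)
    return [
        (
            record.findtext("oai:header/oai:identifier", namespaces=NS),
            record.findtext("oai:metadata/oai_dc:dc/dc:title", namespaces=NS),
        )
        for record in root.iter("{http://www.openarchives.org/OAI/2.0/}record")
    ]


if __name__ == "__main__":
    print(build_request("ListRecords", metadataPrefix="oai_dc"))
    print(build_request("GetRecord", metadataPrefix="oai_dc",
                        identifier="oai:example.org:psalter-1"))
    print(extract_titles(SAMPLE_RESPONSE))
```

A service provider such as a search engine would issue requests of this form against each participating repository and index the harvested Dublin Core fields. Validation against the XML schemas mentioned above is left out of the sketch.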

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string something something dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard MPEG-7 and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
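Guarino’s example of stipulating what “before” means can be made concrete with a small Python sketch. It defines two alternative readings of the relation between periods: one that excludes overlap and one that admits it. The `Period` class, the function names and the year ranges are assumptions made purely for illustration; the years are arbitrary and are not claims about any historical period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval with inclusive start and end years."""
    name: str
    start: int
    end: int

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("a period cannot end before it starts")


def overlaps(a, b):
    """True if the two periods share at least one year."""
    return a.start <= b.end and b.start <= a.end


def before_strict(a, b):
    """Stipulation 1: a is before b only if a ends before b starts (no overlap)."""
    return a.end < b.start


def before_overlap_allowed(a, b):
    """Stipulation 2: a is before b if it starts and ends earlier; overlap is admitted."""
    return a.start < b.start and a.end < b.end


if __name__ == "__main__":
    # Arbitrary illustrative year ranges.
    period_a = Period("Period A", 1400, 1600)
    period_b = Period("Period B", 1550, 1700)
    print(overlaps(period_a, period_b))
    print(before_strict(period_a, period_b))
    print(before_overlap_allowed(period_a, period_b))
```

For two overlapping periods the two stipulations give different answers, which is why the choice has to be fixed explicitly, by axioms in a formal ontology or by a function definition in this sketch, before two systems can exchange statements that use “before”.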

Now it was clear but would it be available toheritage institutions Dr Ross wanted to know iffundamental ontologies of use to the heritage sectoralready existedWhat would they be Before any

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: http://www.ontoknowledge.org/tools/factsheet/Sesame.html
9 Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow, 1991.

development could begin, basic classifications and methodologies would be required to form a foundation for the work.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy. Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’


MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert Dr Dallas described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain, art history for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The


model is under review for adoption as an International Standards Organisation (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things, and on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools. Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): a collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council for Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


Genesis - The Creation Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.


OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that calls for such a generic ontology is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
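One way to picture the violist puzzle is to give each sense of ‘part’ its own name and to let inference follow only links of the same kind. The sketch below is a simplified illustration, not Guarino’s formalisation: the `PartGraph` class and the relation names `component_of` and `member_of` are hypothetical labels introduced here.

```python
from collections import defaultdict


class PartGraph:
    """Stores typed 'part' statements and infers only within one relation type."""

    def __init__(self):
        # (relation, whole) -> set of direct parts
        self._edges = defaultdict(set)

    def add(self, part, relation, whole):
        self._edges[(relation, whole)].add(part)

    def is_part(self, part, relation, whole):
        """True if `part` reaches `whole` through `relation` links only."""
        stack, seen = [whole], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            direct = self._edges.get((relation, current), set())
            if part in direct:
                return True
            stack.extend(direct)
        return False


g = PartGraph()
g.add("finger", "component_of", "violist")    # his finger is part of him
g.add("violist", "member_of", "orchestra")    # he plays a part in the orchestra

finger_in_orchestra = g.is_part("finger", "member_of", "orchestra")
```

Under this stipulation the finger is not inferred to be part of the orchestra, because the two statements use different relations; a system that collapsed both into a single transitive ‘part’ would conclude the opposite.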

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML, C++ and any other language you care to name, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics, that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it


SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren


The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function, how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1, which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision into a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that, while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples. (Cf. http://www.w3.org/TR/rdf-primer)

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
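The view-and-group step can be sketched in a few lines of Python using only the standard library. This is not the FMS implementation: the table names, the `item_view` definition, the element names and the second subject value (‘Creation’) are assumptions made for illustration, and the output does not follow the FMS XML Schema, which is not reproduced here.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby


def build_demo_db():
    """Create a tiny in-memory database with a view joining two tables."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, place TEXT);
        CREATE TABLE item_subject (item_id TEXT, subject TEXT);
        INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature', 'Utrecht');
        INSERT INTO item_subject VALUES ('image5kb78d38i', 'Eve emerges from Adam''s body');
        INSERT INTO item_subject VALUES ('image5kb78d38i', 'Creation');
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.place, s.subject
            FROM item i JOIN item_subject s ON s.item_id = i.item_id;
    """)
    return con


def rows_to_documents(con):
    """Group the view's rows by item and build one XML document per item."""
    cursor = con.execute(
        "SELECT item_id, type, place, subject FROM item_view ORDER BY item_id"
    )
    documents = {}
    for item_id, rows in groupby(cursor, key=lambda row: row[0]):
        rows = list(rows)
        root = ET.Element("item", {"item_id": item_id})
        # Single-valued columns repeat on every joined row; take them once.
        ET.SubElement(root, "type").text = rows[0][1]
        ET.SubElement(root, "place").text = rows[0][2]
        # Multi-valued columns contribute one element per row.
        for row in rows:
            ET.SubElement(root, "subject").text = row[3]
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


docs = rows_to_documents(build_demo_db())
```

The `ORDER BY item_id` matters here, since `groupby` only merges adjacent rows that share a key.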

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services: Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf

ITEM_VIEW (columns): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
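A short Python sketch may make this tangible. It parses two fragments that differ only in tag name and shows that the parser hands back both in exactly the same way; any ‘meaning’ comes from a lookup table the application supplies. The `<record>` wrapper, the `KNOWN_ROLES` table and its wording are hypothetical, introduced only for this illustration.

```python
import xml.etree.ElementTree as ET


def describe(xml_text):
    """Return (tag, text) pairs; the parser attaches no meaning to either."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]


doc_a = "<record><creator>Alexander Master</creator></record>"
doc_b = "<record><H1>Alexander Master</H1></record>"

# Meaning lives in the processing program, not in the markup:
# this table is the application's own convention.
KNOWN_ROLES = {"creator": "person who made the object"}


def interpret(xml_text):
    """Attach the application's reading to each element, if it has one."""
    return [
        (text, KNOWN_ROLES.get(tag, "unknown"))
        for tag, text in describe(xml_text)
    ]


parsed_a = describe(doc_a)
parsed_b = describe(doc_b)
```

Both documents parse equally well; only `interpret` distinguishes them, and only because someone wrote the table by hand.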

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).

Short explanations for the XML document(image5kb78d38ixml) shown in the graphic below

A well-formed XML document is one thatconforms to the XML syntax rules of which wewould like to highlight the following

(1) The document must begin with the XMLdeclaration which defines the XML version and thecharacter encoding used in the document In theexample below we use ltxml version=10encoding=ISO-8859-1gt ie the documentconforms to the 10 specification of XML and usesthe ISO-8859-1 (Latin-1West European) characterset

(210 a) The XML document must contain asingle tag pair to define a root element in ourexample ltmiimagegt ltmiimagegt

Database rows (item data grouped by item):
image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

‘Rows to XML’ process: XML document from database rows

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>

image5kb78d38i.xml

Graphic 1: Database rows to XML process
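The ‘rows to XML’ step of graphic 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the function name row_to_xml is hypothetical, the column order is taken from the ITEM_VIEW listing, and the namespace URI is the example namespace used in this primer.

```python
import xml.etree.ElementTree as ET

# Example namespace from this primer, bound to the prefix "mi".
MI_NS = "http://www.m-i.org/images"
ET.register_namespace("mi", MI_NS)

# Column names of the ITEM_VIEW database view; the first is the identifier.
COLUMNS = ["item_id", "type", "subject", "iconclass",
           "creator", "manuscript", "place", "year"]


def row_to_xml(row):
    """Turn one database row (a tuple of strings) into an mi:image element.

    Hypothetical helper: the identifier becomes the image_id attribute,
    every other column becomes a namespace-qualified child element.
    """
    record = dict(zip(COLUMNS, row))
    image = ET.Element(f"{{{MI_NS}}}image", {"image_id": record["item_id"]})
    for name in COLUMNS[1:]:
        child = ET.SubElement(image, f"{{{MI_NS}}}{name}")
        child.text = record[name]
    return image


row = ("image5kb78d38i", "Column Miniature", "Eve emerges from Adam's body",
       "71A3421", "Alexander Master", "Historic Bible", "Utrecht",
       "circa 1430")
image = row_to_xml(row)
xml_text = ET.tostring(image, encoding="unicode")
print(xml_text)
```

The serialiser writes the namespace declaration on the root element, so the output has the same shape as lines (2) to (10) of the document in graphic 1.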

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags, and because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>) they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
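A short Python sketch may help to see these syntax rules at work. It parses the XML document from graphic 1 with the standard library and then feeds the parser three deliberately broken fragments. The helper name is_well_formed is an assumption made for this illustration; the broken fragments are invented test inputs.

```python
import xml.etree.ElementTree as ET

# The XML document from graphic 1, as bytes in the declared encoding.
DOCUMENT = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""

# Prefix-to-namespace map used when looking up qualified element names.
NS = {"mi": "http://www.m-i.org/images"}


def is_well_formed(data):
    """Return True if the parser accepts the input as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(DOCUMENT)
creator = root.find("mi:creator", NS).text   # namespace-qualified element
image_id = root.get("image_id")              # attribute of the root element
print(creator, image_id)

# Each fragment breaks one of the rules listed above.
print(is_well_formed(b"<a><Creator>x</creator></a>"))  # tags differ in case
print(is_well_formed(b"<a id=x/>"))                     # unquoted attribute
print(is_well_formed(b"<a/><b/>"))                      # two root elements
```

Note that the parser only checks well-formedness. It reads <mi:creator> without knowing what a creator is, which is exactly the point made above about XML carrying no semantics of its own.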

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative’s XML Schema when they create their XML documents and validate them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. ‘creator’, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
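To make concrete what the schema demands of a document, the following Python sketch checks the same three constraints by hand: the namespace-qualified root element, the required image_id attribute, and the ordered sequence of child elements. It is not an XML Schema validator (the Python standard library does not include one), and the names check_image and make_doc are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"

# Child elements in the order declared inside xs:sequence, lines (6)-(12).
CHILD_SEQUENCE = ["type", "subject", "iconclass", "creator",
                  "manuscript", "place", "year"]


def check_image(root):
    """Return a list of problems; an empty list means all checks passed.

    Hand-written stand-in for three constraints of the example schema.
    """
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not image in the target namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{MI_NS}}}{name}" for name in CHILD_SEQUENCE]
    if found != wanted:
        problems.append("child elements differ from the declared sequence")
    return problems


def make_doc(names, with_id=True):
    """Build a small test document (hypothetical helper, dummy values)."""
    attr = ' image_id="image5kb78d38i"' if with_id else ""
    body = "".join(f"<mi:{n}>x</mi:{n}>" for n in names)
    text = f'<mi:image xmlns:mi="{MI_NS}"{attr}>{body}</mi:image>'
    return ET.fromstring(text)


good = make_doc(CHILD_SEQUENCE)
bad = make_doc(list(reversed(CHILD_SEQUENCE)), with_id=False)
print(check_image(good))
print(check_image(bad))
```

In practice a schema-aware validator would perform these checks, and the data type checks, directly from the schema file; the sketch only shows the kind of rule that is being enforced.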

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet: Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.⁴

Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.⁵

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,⁶ and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.⁷

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421, Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place⁸ that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.⁹

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’¹⁰

Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).¹¹ This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.¹²


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body

6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
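The triple model described above is small enough to write down directly. The following Python sketch represents subjects and predicates as URIs and objects as either a URI or a literal, and then collects the nodes and arc labels of the resulting graph. The class names, the ‘#’ separator in the m-i.org schema namespace and the URI chosen for the image resource are assumptions made for this illustration; only the rdf-schema namespace is the real W3C one.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: URI                 # always a resource
    predicate: URI               # the property that labels the arc
    obj: Union[URI, Literal]     # a resource or a literal value


# Assumed namespace of the fictitious M-i.org schema ('#' is a guess).
MI = "http://www.m-i.org/schemas/images#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

graph = [
    Triple(URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf"),
           URI(MI + "Miniature")),
    Triple(URI(MI + "Miniature"), URI(RDFS + "subClassOf"),
           URI(MI + "Image")),
    # Hypothetical resource URI for the image described in graphic 1.
    Triple(URI("http://www.m-i.org/images/image5kb78d38i"),
           URI(MI + "hasCreator"), Literal("Alexander Master")),
]


def node_labels(triples):
    """Nodes of the graph: every subject and every object."""
    return {t.subject for t in triples} | {t.obj for t in triples}


def arc_labels(triples):
    """Arc labels of the graph: every predicate."""
    return {t.predicate for t in triples}


print(len(node_labels(graph)), len(arc_labels(graph)))
```

Keeping URI and Literal as separate types mirrors the distinction in the graph drawing, where literals appear as boxes and can only ever be objects.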

Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#.

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images.

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.

Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule <image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Image Miniature, mi:hasCreator, ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
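The idea of a mapping rule, a triple template whose subject and object are path expressions, can be sketched as follows. This is a simplified illustration and not the rule syntax of the FMS tool, which is not documented here: the rule format, the function apply_rule and the use of element values for the subject are assumptions. The paths use the limited XPath subset that Python's standard library supports.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

# Shortened version of the XML document from graphic 1.
DOCUMENT = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Hypothetical rule: (path for the subject, property, path for the object).
RULE = ("mi:type", "mi:hasCreator", "mi:creator")


def apply_rule(rule, root):
    """Instantiate one triple template against one XML document.

    Returns a list with one triple if both paths match an element,
    and an empty list if the rule does not match.
    """
    subject_path, predicate, object_path = rule
    subject = root.find(subject_path, NS)
    obj = root.find(object_path, NS)
    if subject is None or obj is None:
        return []
    return [(subject.text, predicate, obj.text)]


root = ET.fromstring(DOCUMENT)
triples = apply_rule(RULE, root)
no_match = apply_rule(("mi:type", "mi:hasPlace", "mi:place"), root)
print(triples)
print(no_match)
```

A rule whose paths find no element simply yields no triples, which corresponds to the ‘if the rule matches’ condition in the description above. Mapping the extracted values to ontology classes is a separate step that this sketch leaves out.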

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images.

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
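The transitivity of rdfs:subClassOf can be made tangible with a few lines of Python. The sketch below stores the class definitions as plain (subject, predicate, object) tuples with prefixed names and follows the subclass arcs until no new superclass turns up. The function name superclasses is an assumption made for this illustration; an RDF toolkit would normally provide such inference.

```python
SUBCLASS_OF = "rdfs:subClassOf"

# The class definitions and subclass statements from the text above.
triples = {
    ("mi:Image", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", "rdf:type", "rdfs:Class"),
    ("mi:ColumnMiniature", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", SUBCLASS_OF, "mi:Image"),
    ("mi:ColumnMiniature", SUBCLASS_OF, "mi:Miniature"),
}


def superclasses(cls, statements):
    """Return all direct and inherited superclasses of cls."""
    result = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for subject, predicate, obj in statements:
            if (subject == current and predicate == SUBCLASS_OF
                    and obj not in result):
                result.add(obj)
                todo.append(obj)   # follow the chain one level further
    return result


print(sorted(superclasses("mi:ColumnMiniature", triples)))
```

Only two subclass statements were written down, yet the function reports mi:Image as a superclass of mi:ColumnMiniature as well, which is the implicit statement mentioned above.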

Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (e.g. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
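How domain and range constraints can be used to detect violations may be easier to see in a small sketch. The Python code below checks a single statement against the declared domain and range of its property. Everything in it is illustrative: the class mi:Creator, the instance names, the chosen constraints for mi:hasCreator and the function violations are assumptions, and subclass reasoning is deliberately left out.

```python
# Hypothetical schema: the domain and range declared for one property.
SCHEMA = {
    "mi:hasCreator": {"domain": "mi:Image", "range": "mi:Creator"},
}

# Hypothetical instance data: the classes each resource is an instance of.
TYPES = {
    "mi:image5kb78d38i": {"mi:Image"},
    "mi:AlexanderMaster": {"mi:Creator"},
    "mi:Utrecht": {"mi:Place"},
}


def violations(statement, schema=SCHEMA, types=TYPES):
    """Return the constraint violations of one (subject, property, value)."""
    subject, prop, value = statement
    constraint = schema.get(prop)
    if constraint is None:
        return [f"{prop} is not a declared property"]
    found = []
    # Domain: the subject must be an instance of the domain class.
    if constraint["domain"] not in types.get(subject, set()):
        found.append(f"{subject} is not an instance of {constraint['domain']}")
    # Range: the value must be an instance of the range class.
    if constraint["range"] not in types.get(value, set()):
        found.append(f"{value} is not an instance of {constraint['range']}")
    return found


ok = violations(("mi:image5kb78d38i", "mi:hasCreator", "mi:AlexanderMaster"))
bad = violations(("mi:image5kb78d38i", "mi:hasCreator", "mi:Utrecht"))
print(ok)
print(bad)
```

A validator of the kind described for the FMS editor tool would additionally take the class hierarchy into account, so that an instance of a subclass also satisfies a constraint stated for its superclass.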

Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’¹³

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
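The principle of view-based filtering, selecting a class and narrowing the result step by step, can be illustrated with a toy example. The sketch below is not the Ontogator software and says nothing about how it works internally: the instance records, the class mi:DecoratedInitial, the place values and the function names are all invented for this illustration.

```python
# Hypothetical class hierarchy: each class points to its direct superclass.
SUBCLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
    "mi:DecoratedInitial": "mi:Image",
}

# Hypothetical harvested instance descriptions.
INSTANCES = {
    "image-a": {"class": "mi:ColumnMiniature", "place": "Utrecht"},
    "image-b": {"class": "mi:DecoratedInitial", "place": "Utrecht"},
    "image-c": {"class": "mi:ColumnMiniature", "place": "Bruges"},
}


def is_a(cls, wanted):
    """True if cls equals wanted or is a direct or indirect subclass."""
    while cls is not None:
        if cls == wanted:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def view_filter(wanted_class, **facets):
    """Return the names of instances in the selected class (view) that
    also match every additional property restriction."""
    hits = []
    for name, record in INSTANCES.items():
        if not is_a(record["class"], wanted_class):
            continue
        if all(record.get(key) == value for key, value in facets.items()):
            hits.append(name)
    return sorted(hits)


print(view_filter("mi:Image"))
print(view_filter("mi:Miniature"))
print(view_filter("mi:Miniature", place="Utrecht"))
```

Each further restriction shrinks the result set, which is the sense in which the user ‘constrains’ the views until the items searched for remain.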

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.¹⁴

This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally it is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.¹⁵

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)
| Learning: an agent will improve its performance over time


14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


The Finnish Museums on the Semantic Web: Overview of the System’s Set-up

| User client: WWW browser (topic-based navigation, view-based filtering)
| Server: Ontogator software and RDF database (semantic graph: knowledge space of shared ontology and metadata)
| Web crawler: harvests the RDF instances of the museums
| RDF Schema (ontology/RDFS): semantic interoperability; RDF instances are created with the metadata editor
| XML Schema: syntactic interoperability; XML repositories
| Relational schemas (DBMS): collection databases 1, 2, ... n

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

THE DARMSTADT FORUM

PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01.02.01-30.07.03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it


DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter / Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


CAPITALS AND ACRONYMS

The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.

He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web page. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’

Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).

Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols http and XML and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins

Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
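To make the harvesting idea behind the OAI Protocol for Metadata Harvesting a little more concrete, here is a minimal sketch in Python. It builds a ListRecords request as a plain HTTP GET URL and reads unqualified Dublin Core titles out of a response. The repository address, the record identifier and the sample response are invented for illustration; only the verb, the `oai_dc` metadata prefix and the XML namespaces are taken from OAI-PMH 2.0 and Dublin Core.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL of a ListRecords request (an ordinary HTTP GET)."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(response_xml):
    """Return (identifier, title) pairs found in a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iterfind("oai:ListRecords/oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS)
        title = record.findtext(
            "oai:metadata/oai_dc:dc/dc:title", default="", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Invented, heavily shortened response from a hypothetical museum repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:museum.example.org:object-1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example manuscript</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://museum.example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

A service provider such as a search engine would fetch the URL, parse each response in this way and index the harvested records; a real harvester would also need to handle errors and partial responses, which this sketch leaves out.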

Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy⁶, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk⁷. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Markup Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms (topological relations, mereological⁹ relations, dependence relations, these kinds of things), then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
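Mr Guarino’s ‘before’ example can be illustrated with a small sketch in Python. It shows two possible stipulations of ‘before’ for time intervals, one that excludes overlap and one that allows it, and how they disagree on the same pair of periods. The class, the function names and the years are invented for illustration and are not taken from any particular ontology; a real foundational ontology would state such definitions as logical axioms rather than as program code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval in whole years (both bounds inclusive)."""
    name: str
    start: int
    end: int

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("a period cannot end before it starts")


def overlaps(a, b):
    """True if the two periods share at least one year."""
    return a.start <= b.end and b.start <= a.end


def before_strict(a, b):
    """Stipulation 1: 'before' excludes overlap; a ends before b starts."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """Stipulation 2: 'before' only requires that a starts earlier than b."""
    return a.start < b.start


if __name__ == "__main__":
    # Purely illustrative years, not historical claims.
    earlier = Period("earlier period", 1300, 1500)
    later = Period("later period", 1450, 1600)
    print(overlaps(earlier, later))                 # True
    print(before_strict(earlier, later))            # False
    print(before_allowing_overlap(earlier, later))  # True
```

Two databases that both record ‘A before B’ but rely on different stipulations would disagree about this pair of periods, which is the kind of mismatch that formally defined basic terms are meant to remove.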

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: httpwwwontoknowledgeorg toolsfactsheetSesamehtml
9 Mereology n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage (architecture, film, whatever), all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Standards Organisation (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the “semantic glue” needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Genesis - The Creation Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): a collaboration environment with shared annotations, www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr/
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios/
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts, http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil/

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes a detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): an upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt/

See also the OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref/ and the OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-guide/

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req/

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb/

SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE/

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo/

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML/; the XML Specifications can be found at http://www.w3.org/XML/Core/

‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml/

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991, he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory, the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a concept that a generic ontology has to pin down is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

Developing well-founded generic ontologies may seem an enormous job, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it


A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should help in gaining a general understanding of how semantic interoperability of semantically marked-up cultural heritage information, and new ways of interacting with it, can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme: http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer/).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/

Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
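The view-to-XML step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the two tables, the view name item_view and the function names are hypothetical, the element names are borrowed from the medieval-images example used in this chapter, and namespaces are left out for brevity.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

# Element names taken from the medieval-images example; the table layout is assumed.
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def build_demo_database():
    """Create a toy in-memory collection database with a view joining two tables."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE manuscripts (ms_id INTEGER PRIMARY KEY, title TEXT,
                                  creator TEXT, place TEXT, year TEXT);
        CREATE TABLE images (item_id TEXT PRIMARY KEY, type TEXT, subject TEXT,
                             iconclass TEXT, ms_id INTEGER);
        INSERT INTO manuscripts VALUES
            (1, 'Historic Bible', 'Alexander Master', 'Utrecht', 'circa 1430');
        INSERT INTO images VALUES
            ('image5kb78d38i', 'Column Miniature',
             'Eve emerges from Adam''s body', '71A3421', 1);
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.subject, i.iconclass,
                   m.creator, m.title AS manuscript, m.place, m.year
            FROM images AS i JOIN manuscripts AS m ON m.ms_id = i.ms_id;
        """
    )
    return con


def rows_to_documents(con):
    """Query the view, group rows by item, and build one XML document per item."""
    cursor = con.execute(
        "SELECT item_id, " + ", ".join(FIELDS) + " FROM item_view ORDER BY item_id"
    )
    documents = {}
    for item_id, rows in groupby(cursor, key=lambda row: row[0]):
        root = ET.Element("image", {"image_id": item_id})
        for row in rows:
            for name, value in zip(FIELDS, row[1:]):
                # Add each field once, even if the join yields several rows per item.
                if value is not None and root.find(name) is None:
                    ET.SubElement(root, name).text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


documents = rows_to_documents(build_demo_database())
print(documents["image5kb78d38i"])
```

The view hides how the museum's tables are organised, so the same grouping code could in principle be reused for a database with a different internal layout, as long as its view exposes the same columns.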

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf

ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own, it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
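A small Python sketch may make this point concrete. The parser treats a document with 'speaking' tag names and one with arbitrary tag names in exactly the same way; any meaning comes from the processing program, here a lookup table of labels. The table LABELS and the function describe are hypothetical names chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Two documents with the same structure; only the tag names differ.
READABLE = (
    "<image><creator>Alexander Master</creator>"
    "<manuscript>Historic Bible</manuscript></image>"
)
OPAQUE = "<a><b>Alexander Master</b><c>Historic Bible</c></a>"

# The 'meaning' of each tag is supplied by the program, not by the parser.
LABELS = {"creator": "Created by", "manuscript": "Part of manuscript"}


def describe(xml_text, labels):
    """Return one line per child element, using a label if the program knows the tag."""
    root = ET.fromstring(xml_text)
    return [f"{labels.get(child.tag, child.tag)}: {child.text}" for child in root]


print(describe(READABLE, LABELS))
print(describe(OPAQUE, LABELS))
```

Both documents parse without complaint; only the first produces readable labels, and only because the program happens to contain entries for its tag names.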

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document (the declaration is optional in XML 1.0, but recommended). In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Graphic 1: Database rows to XML process

Database rows, grouped by item:
'image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam's body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

XML document produced from the database rows by the rows-to-XML process (image5kb78d38i.xml):

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names/). Our namespace, declared in the start tag of the root element, is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
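The role of the prefix as a mere placeholder can be checked with Python's standard XML library, which replaces every prefix by the namespace name when it parses a document. The following is a minimal sketch; the shortened document and the alternative prefix img are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

MI = "http://www.m-i.org/images"  # namespace name of the fictitious m-i.org site

DOC = (
    f'<mi:image xmlns:mi="{MI}" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)

root = ET.fromstring(DOC)

# After parsing, the prefix is gone: the tag is stored as {namespace name}local name.
print(root.tag)

# Any prefix works in a query as long as it is bound to the same namespace name.
creator = root.find("mi:creator", {"mi": MI})
same_element = root.find("img:creator", {"img": MI})

# An unqualified name belongs to no namespace and therefore finds nothing.
unqualified = root.find("creator")

print(creator.text, same_element is creator, unqualified)
```

What identifies an element is thus the pair of namespace name and local name; two organisations may both use a creator element without conflict as long as their namespace URIs differ.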

Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
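These rules are enforced by every XML parser, so a quick way to test for well-formedness is simply to try to parse a document. The sketch below uses Python's standard library; the helper name is_well_formed and the sample strings are hypothetical, and namespaces are omitted to keep the samples short.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


SAMPLES = {
    "well-formed": '<image image_id="image5kb78d38i"><creator>Alexander Master</creator></image>',
    "case mismatch": "<image><Creator>Alexander Master</creator></image>",
    "missing end tag": "<image><creator>Alexander Master</image>",
    "unquoted attribute": "<image image_id=image5kb78d38i></image>",
    "two root elements": "<image></image><image></image>",
}

for label, text in SAMPLES.items():
    print(f"{label}: {is_well_formed(text)}")
```

Note that the first sample is accepted although it has no XML declaration; the check covers syntax only and says nothing about whether the document uses the right elements, which is the task of the XML Schema.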

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1)   <?xml version="1.0"?>
(2a)  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)      targetNamespace="http://www.m-i.org/images"
(2c)      xmlns="http://www.m-i.org/images"
(2d)      elementFormDefault="qualified">
(3)     <xs:element name="image">
(4)       <xs:complexType>
(5)         <xs:sequence>
(6)           <xs:element name="type" type="xs:string"/>
(7)           <xs:element name="subject" type="xs:string"/>
(8)           <xs:element name="iconclass" type="xs:string"/>
(9)           <xs:element name="creator" type="xs:string"/>
(10)          <xs:element name="manuscript" type="xs:string"/>
(11)          <xs:element name="place" type="xs:string"/>
(12)          <xs:element name="year" type="xs:string"/>
(13)        </xs:sequence>
(14)        <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)      </xs:complexType>
(16)    </xs:element>
(17)  </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They declare the individual elements of our XML documents, e.g. "type", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
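To make the role of the schema more tangible, the following Python sketch reads the declarations (3) to (16) from the schema and checks a document against them. It is a hand-written illustration using only the standard library, not a real XML Schema validator, and it only covers the three things our schema declares: the parent element, the ordered child elements and the required attribute. The instance document and its element values are assumptions made for this sketch, since graphic 1 is not reproduced here; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
MI = "http://www.m-i.org/images"  # namespace of the fictitious M-i.org

SCHEMA = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.m-i.org/images"
           xmlns="http://www.m-i.org/images"
           elementFormDefault="qualified">
  <xs:element name="image">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="type" type="xs:string"/>
        <xs:element name="subject" type="xs:string"/>
        <xs:element name="iconclass" type="xs:string"/>
        <xs:element name="creator" type="xs:string"/>
        <xs:element name="manuscript" type="xs:string"/>
        <xs:element name="place" type="xs:string"/>
        <xs:element name="year" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="image_id" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Assumed instance document; "unknown" marks placeholder values.
DOCUMENT = """<image xmlns="http://www.m-i.org/images" image_id="image5kb78d38i">
  <type>Image Miniature</type>
  <subject>Eve emerges from Adam's body</subject>
  <iconclass>71A3421</iconclass>
  <creator>Alexander Master</creator>
  <manuscript>unknown</manuscript>
  <place>unknown</place>
  <year>unknown</year>
</image>"""


def declared_structure(schema_text):
    """Return (parent name, ordered child names, required attribute names)."""
    root = ET.fromstring(schema_text)
    element = root.find(f"{{{XS}}}element")
    complex_type = element.find(f"{{{XS}}}complexType")
    sequence = complex_type.find(f"{{{XS}}}sequence")
    children = [child.get("name") for child in sequence]
    required = [
        attr.get("name")
        for attr in complex_type.findall(f"{{{XS}}}attribute")
        if attr.get("use") == "required"
    ]
    return element.get("name"), children, required


def conforms(document_text, schema_text):
    """Check parent element, child order and required attributes only."""
    name, children, required = declared_structure(schema_text)
    root = ET.fromstring(document_text)
    if root.tag != f"{{{MI}}}{name}":
        return False
    if [child.tag for child in root] != [f"{{{MI}}}{c}" for c in children]:
        return False
    return all(attr in root.attrib for attr in required)
```

In practice a museum would rely on a schema-aware XML processor for this step; the sketch only shows which constraints such a processor derives from the schema.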

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet: Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology

In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see: Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 'Eve emerges from Adam's body'. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
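The hierarchical path for 71A3421 can be sketched in a few lines of Python. The sketch assumes that every leading part of the code is itself the code of a broader concept, which holds for this example (7, 71, 71A and so on) but ignores other features of the Iconclass notation. The definitions are the seven levels of the path for this concept; the function name is hypothetical.

```python
# The seven levels of the hierarchical path for 71A3421.
ICONCLASS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis: from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(code, definitions):
    """List (code, definition) pairs from the broadest concept down to `code`."""
    prefixes = (code[:length] for length in range(1, len(code) + 1))
    return [(p, definitions[p]) for p in prefixes if p in definitions]


for code, text in hierarchical_path("71A3421", ICONCLASS):
    print(code, text)
```

The point of the sketch is that the hierarchy is carried by the code itself, which is what makes such a classification system easy to browse but hard to inter-link richly with other conceptual trees.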

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]

Yet, for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]

7 Bible
71 Old Testament
71A Genesis: from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html

Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core, RDF is very abstract, very dry and very academic.'
Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).

RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can also be resources;

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;

| statements: these associate a value for a named property with the resource.

Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
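As a minimal sketch of this data model, the following Python fragment represents triples as plain data structures and keeps URIs and literals apart. It is not an RDF library; the class and function names are hypothetical, and the '#' separator in the m-i.org namespace and the instance URI for the image are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import NamedTuple, Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]  # the object is either a URI or a literal


MI = "http://www.m-i.org/schemas/images#"       # assumed form of our namespace
RDFS = "http://www.w3.org/2000/01/rdf-schema#"  # RDF Schema namespace

graph = {
    Triple(URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf"), URI(MI + "Miniature")),
    Triple(URI(MI + "Miniature"), URI(RDFS + "subClassOf"), URI(MI + "Image")),
    # A statement whose object is a literal (hypothetical instance URI).
    Triple(URI(MI + "image5kb78d38i"), URI(MI + "hasCreator"), Literal("Alexander Master")),
}


def nodes(triples):
    """Nodes of the graph: everything that occurs as subject or object."""
    return {t.subject for t in triples} | {t.obj for t in triples}


def objects_of(triples, subject, predicate):
    """Follow the arcs labelled `predicate` that leave `subject`."""
    return {t.obj for t in triples if t.subject == subject and t.predicate == predicate}
```

Because a graph is just a set of triples, merging the statements of two sources amounts to taking the union of two such sets.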

Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples in which XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples in which the XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule
<image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, 'Alexander Master'>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
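The way such a rule is instantiated can be sketched as follows. The sketch is not the FMS editor tool; it uses the limited XPath subset of Python's standard library, leaves out namespaces for brevity, and assumes element values chosen so that the result matches the triple quoted above. The rule format and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed excerpt of the XML document; namespaces are omitted for brevity.
DOCUMENT = """<image image_id="image5kb78d38i">
  <type>Image Miniature</type>
  <creator>Alexander Master</creator>
</image>"""

# A mapping rule as a triple template: (subject path, property, object path).
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(rule, document_text):
    """Replace the path expressions of a rule by the matching element values."""
    subject_path, prop, object_path = rule
    root = ET.fromstring(document_text)
    subjects = [element.text for element in root.findall(subject_path)]
    objects = [element.text for element in root.findall(object_path)]
    # The rule only matches if both expressions find at least one element.
    return {(s, prop, o) for s in subjects for o in objects}


print(apply_rule(RULE, DOCUMENT))
```

If one of the path expressions finds nothing, the rule does not match and no triple is produced, which is the behaviour described above for non-matching rules.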

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images#

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
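The transitivity of rdfs:subClassOf can be made concrete with a short Python sketch that derives the implicit superclasses from the two explicit statements. It is an illustration of the reasoning step only, written for this primer's example; the function name is hypothetical.

```python
# The two explicit subclass statements, as (subclass, superclass) pairs.
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, subclass_of):
    """Return every direct and implied superclass of `cls`."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)  # follow the chain upwards
    return found


print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_OF)))
```

For mi:ColumnMiniature the function returns both mi:Miniature, which was stated, and mi:Image, which was only implied.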

Defining properties

In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images#) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
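How a validator could use these two constraints is sketched below in Python. The sketch follows the reading given above, in which domain and range restrict which statements are acceptable; it is not the FMS validator. The instance names (ex:a, ex:b) and the function name are hypothetical.

```python
# Schema statements from the text, as (subject, predicate, object) triples.
SCHEMA = {
    ("mi:ColumnMiniature", "rdf:type", "rdfs:Class"),
    ("mi:hasType", "rdf:type", "rdf:Property"),
    ("mi:hasType", "rdfs:domain", "mi:ColumnMiniature"),
    ("mi:hasType", "rdfs:range", "mi:ColumnMiniature"),
}


def violations(statement, types_of, schema):
    """List the domain and range constraints that `statement` breaks.

    `types_of` maps a resource to the set of classes it is an instance of.
    """
    subject, prop, obj = statement
    problems = []
    for s, p, cls in sorted(schema):
        if s != prop:
            continue
        if p == "rdfs:domain" and cls not in types_of.get(subject, set()):
            problems.append(f"domain: {subject} is not an instance of {cls}")
        if p == "rdfs:range" and cls not in types_of.get(obj, set()):
            problems.append(f"range: {obj} is not an instance of {cls}")
    return problems


# Hypothetical instances: ex:a is a column miniature, ex:b has no known class.
TYPES = {"ex:a": {"mi:ColumnMiniature"}}
print(violations(("ex:a", "mi:hasType", "ex:b"), TYPES, SCHEMA))
```

A statement that breaks no constraint yields an empty list, which corresponds to a set of RDF statements that passes validation.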

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:

| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.'[13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
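The idea of view-based filtering can be illustrated with a small Python sketch. It is not the Ontogator software; the class hierarchy reuses the image classes of this primer rather than the FMS textiles ontology, and the class mi:DecoratedInitial, the card names and the function names are hypothetical.

```python
# Assumed class hierarchy: each class points to its direct superclass.
SUBCLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
    "mi:DecoratedInitial": "mi:Image",
}

# Hypothetical RDF cards, reduced to the classes their resource belongs to.
CARDS = {
    "card1": {"mi:ColumnMiniature"},
    "card2": {"mi:DecoratedInitial"},
    "card3": {"mi:ColumnMiniature"},
}


def is_a(cls, view):
    """True if `cls` equals `view` or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == view:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_by_views(cards, views):
    """Return the cards that satisfy every selected class restriction."""
    return sorted(
        name
        for name, classes in cards.items()
        if all(any(is_a(cls, view) for cls in classes) for view in views)
    )


print(filter_by_views(CARDS, ["mi:Image"]))      # all three cards
print(filter_by_views(CARDS, ["mi:Miniature"]))  # only the miniatures
```

Each additional view narrows the result set, which is how the user eventually arrives at the instances searched for.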

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]

This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.[15]

Agent

'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent

'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action

'By flexible autonomous action, we mean reactive, proactive, social.'

| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'

| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'

| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

| Learning: an agent will improve its performance over time.

[14] T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
[15] M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer

K. S. Candan, H. Liu, R. Suvarna, Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin, RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik, Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf

The Finnish Museums on the Semantic Web: Overview of the System's Set-up

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

Components shown in the graphic, from the user down to the collections:
| User client: WWW browser (topic-based navigation, view-based filtering)
| Server: Ontogator software; RDF database holding the semantic graph (knowledge space of shared ontology and metadata); Web crawler
| RDF Schema (semantic interoperability): RDF instances of each museum
| XML Schema (syntactic interoperability): Metadata Editor and XML repository of each museum
| Relational Schemas (DBMS): Collection database 1, Collection database 2, ... Collection database n

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See http://www.asrm.archivi.beniculturali.it/sid/imago/IMAGOIIen.html

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
| SPECTRUM-XML: a project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl.
E-mail: jannekevankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it


DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DIGICULT PROJECT INFORMATION

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria.
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master a.o.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string something something dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’

Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame,⁸ and the Ontology Inference Layer (OIL) language.

Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audiovisual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.

ONTOLOGY TUTORIAL IN 800 WORDS

Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.

Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.

‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’

He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
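Mr Guarino's point about stipulating the meaning of 'before' by axioms can be illustrated with a minimal Python sketch. It is not taken from the Forum discussion: the class name, the function names and the year ranges are assumptions chosen purely for illustration. The two functions encode two possible stipulations, one that excludes overlapping periods and one that admits them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval, given as first and last year (illustrative only)."""
    name: str
    start: int
    end: int

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("start must not be later than end")


def before_strict(a: Period, b: Period) -> bool:
    """Stipulation 1: a is before b only if a has ended when b starts (no overlap)."""
    return a.end < b.start


def before_weak(a: Period, b: Period) -> bool:
    """Stipulation 2: a is before b if it starts earlier and ends earlier; overlap allowed."""
    return a.start < b.start and a.end < b.end


# Hypothetical year ranges, chosen only so that the two periods overlap.
renaissance = Period("Renaissance", 1400, 1600)
other_period = Period("Other period", 1580, 1750)

print(before_strict(renaissance, other_period))
print(before_weak(renaissance, other_period))
```

Under the strict reading the two overlapping periods are not ordered, while under the weak reading the first comes 'before' the second. Two systems exchanging data would need to agree on one of these readings, which is the kind of agreement a formally defined vocabulary is meant to provide.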

Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.

On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.

6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: http://www.ontoknowledge.org/tools/factsheet/Sesame.html
9 Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting, just yet, that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html
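The idea of a common framework 'to which any cultural heritage information can be mapped' can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the CIDOC CRM itself: the source field names, the shared field names (label, creator, production_date) and the shelf mark are hypothetical, and the real model is far richer. The titles and dates are borrowed from the image credits of this issue.

```python
# Hypothetical mappings from two source schemas to one shared vocabulary.
MUSEUM_MAPPING = {"title": "label", "maker": "creator", "made": "production_date"}
LIBRARY_MAPPING = {"dc_title": "label", "dc_creator": "creator", "dc_date": "production_date"}


def to_common(record, mapping):
    """Rename mapped fields; keep unmapped ones under 'unmapped' so nothing is lost."""
    common = {}
    unmapped = {}
    for field, value in record.items():
        if field in mapping:
            common[mapping[field]] = value
        else:
            unmapped[field] = value
    if unmapped:
        common["unmapped"] = unmapped
    return common


# Illustrative records; "shelf" is an invented local field with no shared equivalent.
museum_record = {
    "title": "Der Naturen Bloeme",
    "maker": "Jacob van Maerlant",
    "made": "c. 1350",
    "shelf": "A3",
}
library_record = {
    "dc_title": "Spieghel Historiael",
    "dc_creator": "Jacob van Maerlant",
    "dc_date": "c. 1325-1335",
}

merged = [
    to_common(museum_record, MUSEUM_MAPPING),
    to_common(library_record, LIBRARY_MAPPING),
]

# Once both records use the same field names, one query covers both sources.
by_creator = [r["label"] for r in merged if r["creator"] == "Jacob van Maerlant"]
print(by_creator)
```

The design choice worth noting is that each institution keeps its own schema and only supplies a mapping, which is the mediating role the quotation describes.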

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Annotation & Authoring
‘How to annotate and have fun’: http://annotation.semanticweb.org/faq/faq2 and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council for Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr/
A report on CIDOC’s work on the CRM is provided in: ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in: Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html

Genesis – The Creation Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities, and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
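The layering principle described in the OIL entry, in which a processor limited to a lower layer still understands part of a richer ontology, can be sketched as follows. This is a toy illustration in Python, not OIL syntax: the numeric layers, the statement texts and the function name are assumptions made for the example.

```python
# Each statement carries the (hypothetical) layer that introduced its construct:
# 1 = basic class hierarchy, 2 = property restrictions, 3 = individuals.
ONTOLOGY = [
    (1, "class Manuscript"),
    (1, "Psalter subclass-of Manuscript"),
    (2, "Manuscript has-part min 1 Folio"),
    (3, "psalter_normandy instance-of Psalter"),
]


def readable_subset(statements, max_layer):
    """Return the statements a processor limited to `max_layer` can interpret."""
    return [text for layer, text in statements if layer <= max_layer]


basic_view = readable_subset(ONTOLOGY, 1)  # a simple processor sees the hierarchy only
full_view = readable_subset(ONTOLOGY, 3)   # a full processor sees everything

print(basic_view)
print(len(full_view))
```

The simple processor loses the restrictions and the individuals but still recovers the class hierarchy, which is the 'partial understanding' the entry refers to.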

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing, and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C working draft, 31 March 2003), http://www.w3.org/TR/webont-req
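What it means for OWL to be a 'vocabulary extension' of RDF can be shown with a small Python sketch that stores RDF-style triples as plain tuples and follows subclass links. The three vocabulary URIs are the published RDF, RDF Schema and OWL identifiers; the `http://example.org/heritage#` namespace, the class names and the helper function are hypothetical, and the sketch covers only subclass reasoning, a very small part of what the language offers.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
EX = "http://example.org/heritage#"  # hypothetical namespace for the example

# Statements are (subject, predicate, object) triples, as in RDF.
triples = {
    (EX + "Manuscript", RDF_TYPE, OWL_CLASS),
    (EX + "Psalter", RDF_TYPE, OWL_CLASS),
    (EX + "Psalter", RDFS_SUBCLASS, EX + "Manuscript"),
    (EX + "psalterNormandy", RDF_TYPE, EX + "Psalter"),
}


def inferred_types(resource, graph):
    """Collect asserted types plus every superclass reachable via rdfs:subClassOf."""
    found = set()
    pending = [o for s, p, o in graph if s == resource and p == RDF_TYPE]
    while pending:
        cls = pending.pop()
        if cls in found:
            continue
        found.add(cls)
        pending.extend(o for s, p, o in graph if s == cls and p == RDFS_SUBCLASS)
    return found


types = inferred_types(EX + "psalterNormandy", triples)
print(sorted(types))
```

The item is only declared to be a Psalter, yet a query for manuscripts would also find it, because the ontology states that every Psalter is a Manuscript.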

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: The Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique, and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a generic ontology is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++ and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics, that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it


A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’. [1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’. [2] Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’ [3]

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item the set of rows is combined into a single XML document.
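The view-to-document step described in this box can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table names `item` and `item_field`, the view name `item_view` and the helper `rows_to_documents` are hypothetical and are not taken from the FMS system; the sample values come from the medieval image used in this primer, and namespaces are left out to keep the sketch short.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical in-memory stand-in for a museum collection database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT);
    CREATE TABLE item_field (item_id TEXT, name TEXT, value TEXT);
    INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature');
    INSERT INTO item_field VALUES ('image5kb78d38i', 'creator', 'Alexander Master');
    INSERT INTO item_field VALUES ('image5kb78d38i', 'place', 'Utrecht');
    -- The view is a virtual table that joins the two tables.
    CREATE VIEW item_view AS
        SELECT i.item_id, i.type, f.name, f.value
        FROM item AS i JOIN item_field AS f ON f.item_id = i.item_id;
""")


def rows_to_documents(connection):
    """Return {item_id: XML string}, one document per collection item."""
    rows = connection.execute(
        "SELECT item_id, type, name, value FROM item_view ORDER BY item_id"
    ).fetchall()
    documents = {}
    # Rows arrive sorted by item_id, so groupby yields one group per item.
    for item_id, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        root = ET.Element("image", {"image_id": item_id})
        ET.SubElement(root, "type").text = group[0][1]
        for _, _, name, value in group:
            ET.SubElement(root, name).text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


documents = rows_to_documents(conn)
print(documents["image5kb78d38i"])
```

A real export would additionally add the namespace and check each document against the initiative's XML Schema before publishing it.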

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Eero Hyvönen et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services: Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2/10 a) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Graphic 1: Database rows to XML process

Database rows (item data from database rows, grouped by item):
‘image5kb78d38i’, ‘Column Miniature’, ‘Eve emerges from Adam’s Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

XML document (image5kb78d38i.xml):
(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
         image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), start and end tags must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
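These rules can be tried out with any XML parser. The following minimal sketch uses the parser in Python's standard library; the helper `is_well_formed` and the constants `GOOD` and `BAD` are names chosen for this illustration only, and the document is a shortened version of the example above.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace mapping used when searching for elements.
NS = {"mi": "http://www.m-i.org/images"}

GOOD = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)
# Same document, but the end tag differs in case from its start tag.
BAD = GOOD.replace("</mi:creator>", "</mi:Creator>")


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(GOOD)
creator = root.find("mi:creator", NS).text

print(is_well_formed(GOOD), is_well_formed(BAD), creator)
```

Note that the parser reports the root element under its full namespace name rather than under the prefix, which is why the prefix is only a placeholder: a different prefix bound to the same URI would denote the same element.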

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)    <xs:element name="image">
(4)      <xs:complexType>
(5)        <xs:sequence>
(6)          <xs:element name="type" type="xs:string"/>
(7)          <xs:element name="subject" type="xs:string"/>
(8)          <xs:element name="iconclass" type="xs:string"/>
(9)          <xs:element name="creator" type="xs:string"/>
(10)         <xs:element name="manuscript" type="xs:string"/>
(11)         <xs:element name="place" type="xs:string"/>
(12)         <xs:element name="year" type="xs:string"/>
(13)       </xs:sequence>
(14)       <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)     </xs:complexType>
(16)   </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3/16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5/13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
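To make the constraints expressed by this schema more tangible, the sketch below hand-codes three of them in Python: the namespace-qualified root element, the required image_id attribute and the fixed order of the child elements. It is a deliberately simplified stand-in, not an XML Schema validator (Python's standard library does not include one), and the names `check_image_document`, `EXPECTED_CHILDREN`, `VALID` and `INVALID` are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

NAMESPACE = "http://www.m-i.org/images"
# Child elements in the order required by the xs:sequence of the schema.
EXPECTED_CHILDREN = [
    "type", "subject", "iconclass", "creator", "manuscript", "place", "year",
]


def check_image_document(text):
    """Return a list of problems; an empty list means the checks passed."""
    root = ET.fromstring(text)
    problems = []
    if root.tag != f"{{{NAMESPACE}}}image":
        problems.append("root element is not 'image' in the expected namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute 'image_id' is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{NAMESPACE}}}{name}" for name in EXPECTED_CHILDREN]
    if found != wanted:
        problems.append("child elements differ from the required sequence")
    return problems


VALID = (
    f'<mi:image xmlns:mi="{NAMESPACE}" image_id="image5kb78d38i">'
    "<mi:type>Column Miniature</mi:type>"
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:iconclass>71A3421</mi:iconclass>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:manuscript>Historic Bible</mi:manuscript>"
    "<mi:place>Utrecht</mi:place>"
    "<mi:year>circa 1430</mi:year>"
    "</mi:image>"
)
# The same document without the required attribute.
INVALID = VALID.replace(' image_id="image5kb78d38i"', "")

print(check_image_document(VALID))
print(check_image_document(INVALID))
```

In practice a schema-aware validator would be used instead, so that the rules live in the shared Schema file rather than in program code.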

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling for example the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML they can be displayed across a variety of media, using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge. [4]

Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge. [5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies [6], and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate. [7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421, Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.
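For the notation used in this example, the hierarchical path can be read off the code itself, because every additional character denotes one further level of the classification. The short Python sketch below illustrates this for 71A3421. It assumes a plain notation without bracketed text or other extensions, which the full Iconclass system also allows; the function name `iconclass_path` and the `LABELS` table are introduced for this illustration only, with the labels copied from the path shown in this primer.

```python
def iconclass_path(notation):
    """List a plain Iconclass notation and its ancestors, broadest first."""
    return [notation[:length] for length in range(1, len(notation) + 1)]


# Textual correlates for the example path, as given in this primer.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A34": "creation of Eve",
    "71A3421": "Eve emerges from Adam's body",
}

path = iconclass_path("71A3421")
for code in path:
    # Indent each level by its depth; unknown labels are left blank.
    print("  " * (len(code) - 1) + code, LABELS.get(code, ""))
```

A hierarchy that is implicit in the code in this way is easy for software to navigate, but it expresses only broader and narrower relations, not the richer links between concepts that an ontology aims at.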

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture. [9]

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’ [10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM). [11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably. [12]


Hierarchical path of the Iconclass classification 71A3421:
7 Bible
  71 Old Testament
    71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
      71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
        71A34 creation of Eve
          71A342 Eve is fashioned from Adam’s rib
            71A3421 Eve emerges from Adam’s body

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1The RDF Data ModelProgressing towards semantic interoperability themetadata in the XML documents are now transformedinto RDF statements corresponding to the RDF datamodelWith these so-called RDF lsquotriplesrsquo the XMLmetadata elements are mapped to the RDF classes andproperties which are defined by the RDF Schema ofthe FMS initiative (see section 3)

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry, and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can also be resources;

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;

| statements: these associate a value for a named property with the resource.

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

[Graphic 2: RDF Data Model. Two triples drawn as a labelled directed graph:
Subject http://www.m-i.org/schemas/images#ColumnMiniature, Predicate http://www.w3.org/2000/01/rdf-schema#subClassOf, Object http://www.m-i.org/schemas/images#Miniature;
Subject http://www.m-i.org/schemas/images#Miniature, Predicate http://www.w3.org/2000/01/rdf-schema#subClassOf, Object http://www.m-i.org/schemas/images#Image.]

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
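To make the subject, predicate, object form concrete, the two triples of Graphic 2 can be written down as plain data. The following is only a minimal sketch in Python: the `Triple` type and the `objects` helper are illustrative names introduced here, and the `#` separator at the end of the example namespace is an assumption.

```python
from typing import NamedTuple

# Namespace URIs; the trailing '#' on the example namespace is an assumption.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"


class Triple(NamedTuple):
    """One RDF statement: a subject node, a predicate arc and an object node."""
    subject: str
    predicate: str
    obj: str


# The two statements shown in Graphic 2.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Return every object reached from `subject` along an arc labelled `predicate`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


direct_superclasses = objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf")
```

Because a graph is just a set of such triples, adding a statement never requires changing the structure of the existing ones.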

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples in which XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples in which the XPath expressions are substituted by the corresponding values of the XML elements.

For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator 'Alexander Master'>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
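The way a mapping rule is instantiated can be sketched with Python's standard XML library. This is not the FMS tool or its rule syntax: the XML record, its element and attribute names, the XPath expressions and the `apply_rule` function are all hypothetical, chosen only to show how a template of XPath expressions evaluates to a triple.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML metadata record; element and attribute names are assumptions.
XML_DOC = """
<collection>
  <image id="5kb78d38i">
    <type>Miniature</type>
    <creator>Alexander Master</creator>
  </image>
</collection>
"""

# A mapping rule as a triple template:
# (XPath selecting the subject value, RDF property, XPath selecting the object value).
RULE = (
    ".//image[@id='5kb78d38i']/type",
    "mi:hasCreator",
    ".//image[@id='5kb78d38i']/creator",
)


def apply_rule(root, rule):
    """Instantiate the template; return an empty list if the rule does not match."""
    subject_path, prop, object_path = rule
    subject = root.find(subject_path)
    obj = root.find(object_path)
    if subject is None or obj is None:
        return []
    # Substitute the XPath expressions by the matching element values.
    return [(subject.text, prop, obj.text)]


root = ET.fromstring(XML_DOC)
triples = apply_rule(root, RULE)
```

A rule whose XPath expressions select nothing yields no triples, which mirrors the condition "if the rule matches" above.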

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images#

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
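The implicit subclass relationship can be derived mechanically by following rdfs:subClassOf arcs until no new class is found. The sketch below assumes the two subclass statements given above; the function name `superclasses` and the pair-based representation are illustrative choices, not part of RDF Schema.

```python
# The two explicit rdfs:subClassOf statements, as (subclass, superclass) pairs.
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, subclass_of):
    """Return all classes reachable from `cls` by following subClassOf arcs."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            # The 'found' check also keeps the loop finite if the data contain a cycle.
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


implied = superclasses("mi:ColumnMiniature", SUBCLASS_OF)
```

Here `implied` contains mi:Image although no statement links mi:ColumnMiniature to mi:Image directly.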

Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
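Read as constraints, as they are in this primer, domain and range declarations let a tool check whether a statement uses a property with instances of the right classes. The following sketch shows one way such a check could look. The schema entries repeat the two mi:hasType declarations above; the instance names, the `TYPES` table and the `violations` function are hypothetical, and subclass reasoning is left out to keep the example short.

```python
# Property declarations taken from the two examples above.
SCHEMA = {
    "mi:hasType": {"domain": "mi:ColumnMiniature", "range": "mi:ColumnMiniature"},
}

# Hypothetical instances and the class each one is declared to belong to.
TYPES = {
    "mi:image-001": "mi:ColumnMiniature",
    "mi:image-002": "mi:ColumnMiniature",
    "mi:museum-A": "mi:Museum",
}


def violations(statement, schema, types):
    """List which constraints ('domain', 'range') a statement violates."""
    subject, prop, value = statement
    constraint = schema.get(prop)
    if constraint is None:
        return ["unknown property"]
    problems = []
    if types.get(subject) != constraint["domain"]:
        problems.append("domain")
    if types.get(value) != constraint["range"]:
        problems.append("range")
    return problems


ok = violations(("mi:image-001", "mi:hasType", "mi:image-002"), SCHEMA, TYPES)
bad = violations(("mi:museum-A", "mi:hasType", "mi:image-002"), SCHEMA, TYPES)
```

The second statement is rejected because its subject is not an instance of the class named in the domain declaration.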

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:

| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.' [13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com, searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
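Since every RDF card is a set of triples, combining harvested cards with the shared ontology amounts to a set union. The sketch below illustrates only that idea; it does not reproduce the FMS crawler, and the museum names, instance identifiers and the `build_repository` function are invented for the example.

```python
# Shared ontology statements (a single one here, for brevity).
ONTOLOGY = {("mi:Miniature", "rdfs:subClassOf", "mi:Image")}

# Hypothetical harvest result: for each museum, a list of RDF cards (sets of triples).
CARDS_BY_MUSEUM = {
    "museum-a": [{("a:img-1", "rdf:type", "mi:Miniature")}],
    "museum-b": [
        {("b:img-7", "rdf:type", "mi:Image")},
        {("b:img-8", "rdf:type", "mi:Miniature")},
    ],
}


def build_repository(ontology, cards_by_museum):
    """Union the ontology and all harvested cards into one set of triples."""
    repository = set(ontology)
    for cards in cards_by_museum.values():
        for card in cards:
            repository |= card
    return repository


repository = build_repository(ONTOLOGY, CARDS_BY_MUSEUM)
```

Because the repository is a set, harvesting the same card twice adds nothing new.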

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
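The principle of view-based filtering can be sketched as follows: selecting a class returns every instance whose own class is that class or one of its subclasses. This is an illustration of the idea only, not Ontogator's algorithm; the instance identifiers and the functions `is_a` and `filter_view` are hypothetical, and each class is assumed to have at most one direct superclass.

```python
# Class hierarchy from the primer: each class mapped to its direct superclass.
SUBCLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
}

# Hypothetical collection instances and their most specific class.
INSTANCES = {
    "img-1": "mi:ColumnMiniature",
    "img-2": "mi:Miniature",
    "img-3": "mi:Image",
}


def is_a(cls, target):
    """True if `cls` equals `target` or is a direct or indirect subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_view(instances, selected_class):
    """Return the instances that fall under the selected class (the 'view')."""
    return sorted(name for name, cls in instances.items() if is_a(cls, selected_class))


broad = filter_view(INSTANCES, "mi:Miniature")
narrow = filter_view(INSTANCES, "mi:ColumnMiniature")
```

Selecting a more specific class narrows the result, which corresponds to constraining the view further.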

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge's introduction to multiagent systems. [15]

Agent

'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent

'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action

'By flexible autonomous action, we mean reactive, proactive, social.'

| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'

| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'

| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

| Learning: an agent will improve its performance over time.


[14] T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

[15] M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (The Finnish Museums on the Semantic Web: overview of the system's set-up). Components shown:
| User client: WWW browser, with topic-based navigation and view-based filtering
| Server: Ontogator software and RDF database holding the semantic graph (knowledge space of shared ontology and metadata)
| Web crawler collecting the RDF instances
| RDF Schema level (semantic interoperability): ontology/RDFS, metadata editor, RDF instances
| XML Schema level (syntactic interoperability): XML repositories
| Relational schemas, DBMS: collection databases 1, 2, ... n]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems. ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
| SPECTRUM-XML: a project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University. Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it


DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DIGICULT PROJECT INFORMATION

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th 2002 in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd 2002 in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5, from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7, from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


development could begin, basic classifications and methodologies would be required to form a foundation for the work.

Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’

Mr Guarino answered: ‘The key concepts and the key relations.’

THE ELUSIVE SEMANTIC GRAIL

By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’

MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses

Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’

Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’

THE LUCK CHANGES

The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.

‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’

Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001 the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations, www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council for Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts, http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


Genesis ndash The Creation Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.


OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced: Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique, and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
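To make the point that these protocols are XML-based more concrete, the following minimal sketch builds a SOAP 1.1 request envelope with Python's standard library. Only the SOAP envelope namespace is standard; the application namespace, the searchCollection operation and the keyword parameter are hypothetical names invented for illustration and do not describe any real museum service.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (standard).
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical application namespace, invented for this sketch.
APP_NS = "http://example.org/museum"

ET.register_namespace("soap", SOAP_NS)


def build_request(keyword):
    """Return a SOAP envelope (as text) wrapping a hypothetical search call."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # 'searchCollection' and 'keyword' are assumed names, not a real API.
    operation = ET.SubElement(body, f"{{{APP_NS}}}searchCollection")
    ET.SubElement(operation, f"{{{APP_NS}}}keyword").text = keyword
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request("textiles")
print(request_xml)
```

In practice the message layout would be dictated by the service's WSDL description rather than written by hand; the sketch only shows that the message itself is an ordinary XML document.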

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies, based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that a generic ontology has to pin down is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
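The violist example can be illustrated with a small sketch. It assumes, purely for illustration, that ‘part’ is split into two explicitly named relations, component_of and member_of, and that each relation is transitive only with itself. These relation names and this particular split are assumptions of the sketch, not a solution attributed to Guarino or to any published ontology.

```python
from collections import defaultdict


class PartGraph:
    """Stores typed 'part' relations; each kind is followed separately."""

    def __init__(self):
        # Maps (kind, part) to the set of wholes it directly belongs to.
        self._edges = defaultdict(set)

    def add(self, kind, part, whole):
        self._edges[(kind, part)].add(whole)

    def is_part(self, kind, part, whole):
        """True if 'whole' is reachable from 'part' via edges of one kind."""
        seen, stack = set(), [part]
        while stack:
            current = stack.pop()
            for parent in self._edges.get((kind, current), ()):
                if parent == whole:
                    return True
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return False


graph = PartGraph()
graph.add("component_of", "fingertip", "finger")
graph.add("component_of", "finger", "violist")
graph.add("member_of", "violist", "orchestra")

# Transitive within one relation: the fingertip is a component of the violist.
print(graph.is_part("component_of", "fingertip", "violist"))
# Not transitive across relations: the finger is not a member of the orchestra.
print(graph.is_part("member_of", "finger", "orchestra"))
```

With a single undifferentiated ‘part’ relation the second query would succeed, which is exactly the unwanted inference the violist example points to.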

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML, C++ and all the other languages you can name, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

Developing well-founded generic ontologies may seem an enormous job, but the task is smaller than it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.[1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.[2] Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision into a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural FitAn observer of the diffusion of the ResourceDescription Framework (RDF) into variousdomains Robert DuCharme has commentedlsquoI still find it a little ironic that while RDF hasgotten so much publicity as a technology for warmand fuzzy AI (Artificial Intelligence) pie-in-the-skytechnology itrsquos gotten most of its traction in themundane world of metadata3

Yet given the importance of metadata for theSemantic Web vision in general it does not come as

a surprise that metadata of key information com-munities belong to the first of RDFrsquos intended usesRDF seems to gain momentum in particular amongthe library and other communities that use DublinCore

The actual W3C RDF Primer (Working Draft23 January 2003) edited by Frank Manola and EricMiller labels RDF as lsquoan ideal representation forDublin Core informationrsquo and describes Dublin Coreas one of their lsquoRDF in the fieldrsquo examples (Cfhttpwwww3orgTRrdf-primer)

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/

Syntactic Transformation 1: Creating the XML Documents

In the FMS system the Extensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface: a virtual table that results from an SQL query, which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item the set of rows is combined into a single XML document.

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents

A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf

ITEM_VIEW (database view columns): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
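The point can be illustrated with a minimal Python sketch using the standard library parser. The sample record and the CREATOR_TAGS rule table are assumptions made up for this illustration; they are not part of the FMS system.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the same text marked up with two different tags.
doc = "<record><creator>Alexander Master</creator><H1>Alexander Master</H1></record>"
root = ET.fromstring(doc)

# To the parser, both tags are just names attached to text.
parsed = [(child.tag, child.text) for child in root]

# Meaning only enters through the processing program. This rule table is an
# assumption of this program, not something XML itself provides.
CREATOR_TAGS = {"creator"}
creators = [text for tag, text in parsed if tag in CREATOR_TAGS]

print(parsed)
print(creators)
```

The parser treats both elements identically; only the application-level rule decides that one of them names a creator.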

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which states the XML version and the character encoding used in the document (the declaration is optional in XML 1.0, but recommended). In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair that defines the root element, in our example <mi:image> ... </mi:image>.

Item data from database rows (grouped by item):
‘image5kb78d38i’, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

XML document (image5kb78d38i.xml) resulting from the rows to XML process:

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element. They can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all elements must have a closing tag; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), start tags and end tags must be written with the same case; and all attribute values must be within quotation marks (e.g. "image5kb78d38i").
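The rows to XML step and the well-formedness rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name row_to_xml is hypothetical, and the rows of the view are assumed to have already been merged into one record per item. It is not the FMS implementation.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # the fictitious namespace of the example
ET.register_namespace("mi", MI_NS)   # serialise elements with the prefix "mi"

FIELDS = ("type", "subject", "iconclass", "creator", "manuscript", "place", "year")

# Assumption: one merged record per collection item, as read from the view.
row = {
    "item_id": "image5kb78d38i",
    "type": "Column Miniature",
    "subject": "Eve emerges from Adam's body",
    "iconclass": "71A3421",
    "creator": "Alexander Master",
    "manuscript": "Historic Bible",
    "place": "Utrecht",
    "year": "circa 1430",
}

def row_to_xml(record):
    """Build one namespace-qualified XML document for one item (hypothetical helper)."""
    root = ET.Element(f"{{{MI_NS}}}image", {"image_id": record["item_id"]})
    for field in FIELDS:
        child = ET.SubElement(root, f"{{{MI_NS}}}{field}")
        child.text = record[field]
    return ET.tostring(root, encoding="unicode")

xml_text = row_to_xml(row)

# Parsing the result again raises ParseError if it is not well-formed.
reparsed = ET.fromstring(xml_text)
print(xml_text)
```

Building the document through an XML library rather than by string concatenation means that closing tags, nesting and attribute quoting are handled by the library.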

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. ‘creator’, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element declared with xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
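What validation against such a schema checks can be approximated with a hand-written stand-in. The sketch below is not an XML Schema validator (the Python standard library does not include one); it only mirrors three rules of the example schema: the root element, the required image_id attribute and the ordered sequence of child elements. The function name and the sample documents are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

def check_image_document(xml_text):
    """Return a list of problems; an empty list means all three checks passed."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not 'image' in the expected namespace")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != expected:
        problems.append("child elements differ from the declared sequence")
    return problems

# Hypothetical sample documents with placeholder values.
good = ('<image xmlns="http://www.m-i.org/images" image_id="image5kb78d38i">'
        + "".join(f"<{n}>x</{n}>" for n in EXPECTED_CHILDREN)
        + "</image>")
bad = good.replace(' image_id="image5kb78d38i"', "").replace("<year>x</year>", "")

print(check_image_document(good))
print(check_image_document(bad))
```

In practice a museum would run a real schema validator against the initiative’s XML Schema; the sketch only shows the kind of structural rule that such a validator enforces.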

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology

In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.

[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems in which terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1 the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’[10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.

[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl

[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html

[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html

[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327

[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.

[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’ the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can also be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.

Graphic 2: RDF Data Model
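The data model can be sketched in Python as a set of three-part tuples. This is a minimal illustration, not an RDF library: the URI, Literal and Triple types are defined here for the example only, and the item URI and the mi:hasCreator statement are assumptions added to show a literal-valued object.

```python
from typing import NamedTuple, Union

class URI(str):
    """A resource identified by a URI reference."""

class Literal(str):
    """A plain value such as a name or a date (drawn as a box in the graph)."""

class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]

MI = "http://www.m-i.org/schemas/images#"       # fictitious example namespace
RDFS = "http://www.w3.org/2000/01/rdf-schema#"  # RDF Schema namespace

graph = {
    Triple(URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf"), URI(MI + "Miniature")),
    Triple(URI(MI + "Miniature"), URI(RDFS + "subClassOf"), URI(MI + "Image")),
    # Hypothetical instance statement whose object is a literal:
    Triple(URI(MI + "image5kb78d38i"), URI(MI + "hasCreator"), Literal("Alexander Master")),
}

# Nodes are all subjects and objects; arcs are labelled by the predicates.
nodes = {t.subject for t in graph} | {t.obj for t in graph}
literal_objects = [t.obj for t in graph if isinstance(t.obj, Literal)]

print(len(nodes), literal_objects)
```

Because every statement has the same three-part shape, statements from different sources can be merged simply by taking the union of their sets.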

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples in which XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples in which the XPath expressions are substituted by the corresponding values of the XML elements.

For example, by applying the template rule <image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Image: Miniature, mi:hasCreator, ‘Alexander Master’>
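How a template rule is instantiated can be sketched as follows. The rule notation (a tuple whose parts starting with "xpath:" are path expressions) and the function apply_rule are assumptions made for this illustration; the actual rule syntax of the FMS tools is not given in the text. The sketch relies on the limited XPath subset supported by Python’s ElementTree.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

xml_text = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Hypothetical rule notation: parts prefixed with "xpath:" are path
# expressions, all other parts are copied into the triple unchanged.
rule = ("xpath:mi:type", "mi:hasCreator", "xpath:mi:creator")

def apply_rule(rule, root):
    """Instantiate a triple template; return None if a path does not match."""
    triple = []
    for part in rule:
        if part.startswith("xpath:"):
            value = root.findtext(part[len("xpath:"):], namespaces=NS)
            if value is None:
                return None  # the rule does not match this document
            triple.append(value)
        else:
            triple.append(part)
    return tuple(triple)

root = ET.fromstring(xml_text)
result = apply_rule(rule, root)
print(result)
```

A rule whose path expressions find no matching element yields no triple, which corresponds to the condition ‘if the rule matches’ in the text.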

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
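The effect of transitivity can be sketched with a small Python function that follows the directly stated subclass links. The dictionary layout and the function name superclasses are assumptions for illustration; an RDF Schema-aware system would derive the same implicit statements.

```python
# Directly stated rdfs:subClassOf relations from the example.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}

def superclasses(cls, direct=SUBCLASS_OF):
    """Return every class reachable from cls via rdfs:subClassOf links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in direct.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

print(superclasses("mi:ColumnMiniature"))
```

Only two subclass statements were written down, yet the function also reports mi:Image as a superclass of mi:ColumnMiniature.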

Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (e.g. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, that mi:hasType is a property, and that RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
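A validator that checks statements against such declarations might look roughly like the following Python sketch. It reads domain and range as constraints, as the text does, and ignores subclass relations for brevity. The instance names (ex:...), the class ex:Place and the function violations are hypothetical and serve only to exercise the check.

```python
# Property declarations taken from the example statements above.
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical instances and their classes.
TYPES = {
    "ex:miniatureA": "mi:ColumnMiniature",
    "ex:miniatureB": "mi:ColumnMiniature",
    "ex:utrecht": "ex:Place",
}

def violations(subject, prop, obj, types=TYPES):
    """Return the constraints ('domain', 'range') that a statement breaks."""
    problems = []
    if prop in DOMAIN and types.get(subject) != DOMAIN[prop]:
        problems.append("domain")
    if prop in RANGE and types.get(obj) != RANGE[prop]:
        problems.append("range")
    return problems

print(violations("ex:miniatureA", "mi:hasType", "ex:miniatureB"))
print(violations("ex:miniatureA", "mi:hasType", "ex:utrecht"))
```

A statement is accepted when the list is empty; otherwise the cataloguer could be shown which constraint the statement breaks.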

Benefits of RDF

In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’[13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com Definitions: Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
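The idea of view-based filtering can be sketched as a class-based selection over the combined instance data. This is only a toy illustration, not the Ontogator software: the item identifiers are invented, and the class hierarchy reuses the medieval image example of this primer rather than the FMS textiles ontology.

```python
# Class hierarchy from the example (each class has at most one direct superclass).
SUBCLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
}

# Hypothetical instances harvested from two museums, with their classes.
INSTANCES = {
    "espoo:item1": "mi:ColumnMiniature",
    "nm:item7": "mi:Miniature",
    "nm:item9": "mi:Image",
}

def is_a(cls, target):
    """True if cls equals target or is a direct or indirect subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

def select(view_class):
    """Return the instances that fall under the selected class (view)."""
    return sorted(item for item, cls in INSTANCES.items() if is_a(cls, view_class))

print(select("mi:Image"))
print(select("mi:Miniature"))
print(select("mi:ColumnMiniature"))
```

Choosing a more specific class narrows the result set, which is the sense in which constraining the views leads the user to the items sought.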

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent soft-ware agents which lsquounderstandrsquo semantic relationshipsbetween Web resources and seek relevant informationas well as perform transactions for humans14

This software would be capable of autonomousaction ie could run without direct human controlor constant supervision and ideally is very flexible indoing this Characterisations of this flexibility includeactions that are lsquoreactive lsquoproactiversquo and lsquosocialrsquo (seebelow)

While the basic idea of agents is very intuitive andappealing the actual theory is complex the tools areimmature the solutions small and prototype-basedIn fact as a parallel distributed systems technologyagents belong to the most complex class of softwaretechnology

However this primer will conclude with a sum-mary of what an intelligent software agent is andwhat such a software would generally be capable ofdoingThis should also serve as an indication of howgreat the challenge for research and technologicaldevelopment is to make the full Semantic Web visiona reality

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’

• Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).’

• Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

• Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

• Mobility: the ability to move around an electronic network.

• Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

• Learning: an agent will improve its performance over time.
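The three characteristics of flexible autonomous action can be made concrete with a toy sketch. It is only an illustration of the vocabulary, assuming a trivial environment of resource names; the class, its methods and the sample data are hypothetical and do not come from Wooldridge's book or from any agent framework.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchAgent:
    name: str
    goal: str  # topic the agent is looking for
    known: List[str] = field(default_factory=list)
    inbox: List[str] = field(default_factory=list)

    def react(self, new_resources: List[str]) -> List[str]:
        """Reactivity: respond to changes in the environment as they occur."""
        relevant = [r for r in new_resources
                    if self.goal in r and r not in self.known]
        self.known.extend(relevant)
        return relevant

    def next_action(self) -> str:
        """Proactiveness: choose the next step from the goal, not from a stimulus."""
        return "report" if self.known else "search"

    def tell(self, other: "SketchAgent", message: str) -> None:
        """Social ability: pass a message to another agent."""
        other.inbox.append(f"{self.name}: {message}")


# Hypothetical run with two agents and two newly published resources.
collector = SketchAgent(name="collector", goal="chair")
curator = SketchAgent(name="curator", goal="chair")
first_action = collector.next_action()
found = collector.react(["museum_a/chair-17", "museum_b/table-3"])
second_action = collector.next_action()
collector.tell(curator, found[0])
```

A real agent would replace the substring match with reasoning over the semantic graph and the plain message with an agent communication language, which is where the complexity noted above arises.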


14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer


K. S. Candan, H. Liu, R. Suvarna, Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin, RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik, Framework for the Semantic Web: An RDF Tutorial, httpwwwidaliuse~asmpacoursesswebrdfrdf_tutorialpdf


The Finnish Museums on the Semantic Web: Overview of the System’s Set-up

[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Components shown: user client (WWW browser with topic-based navigation and view-based filtering); server (Ontogator software); RDF database (semantic graph: knowledge space of shared ontology and metadata); Web crawler; and, for each museum, collection databases 1 to n (relational schemas, DBMS), XML repositories (XML Schema: syntactic interoperability), a metadata editor, and RDF instances (RDF Schema: semantic interoperability).]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernherbehrendtsalzburgresearchat

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: infocriticalpublicscom

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML: A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bertnladlibsoftcom

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: NicolaGuarinoladsebpdcnrit

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meliedwit

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: PMillerukolnacuk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: FrankNackcwinl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also httpwwwgeogportacukhist-boundpeopleniccoluccihtm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: SRosshatiiartsglaacuk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottiangalileoimssfirenzeit


DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntramgesersalzburgresearchat, Phone: +43-(0)662-2288-303

Mr John Pereira, johnpereirasalzburgresearchat, Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, srosshatiiartsglaacuk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th 2002 in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd 2002 in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


model is under review for adoption as an International Standards Organisation (ISO) publication.

Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’

He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’

And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’

The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’

There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’

The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.

They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.

So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’

Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’

That seemed to make it game, set and match.

CIDOC CRM: ‘The Semantic Glue’

The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000 the CIDOC CRM has been under development as an ISO standard.

‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’

http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project

Since June 2001 the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).

http://cidoc.ics.forth.gr/chios_iso.html

SEMANTIC WEB TERMS AND READING LIST A-X

Compiled by Guntram Geser and Michael Steemson

[Illustration: Genesis – The Creation Stars and Fishes]

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html

OIL - Ontology Inference Layer OIL is a proposal for a Web-based representationand inference layer for ontologies It uses a layeredapproach to defining a standard ontology languageEach layer adds functionality and complexity to theprevious layerThis is done such that machines thatcan only process a lower layer can still partiallyunderstand high-level ontologies Seehttpwwwontoknowledgeorgoil

A white paper on OIL functions lsquoAn informaldescription of Standard OIL and Instance OILrsquo bythe On-To-Knowledge group led by Department ofComputer Science University of Manchester UK28 November 2000 wwwontoknowledgeorgoildownloil-whitepaperpdf

OntologiesOntology research Laboratory for Applied Ontology(LOA) Institute of Cognitive Sciences and Techno-logy (ISTC)httpwwwladsebpdcnritinforontologyontologyhtml

Dieter Fensel OntologiesA Silver Bullet forKnowledge Management and Electronic CommerceNew York Springer 2001

‘Ontology Infrastructure for the Semantic Web’ (includes detailed subject bibliography), WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org

The Semantic Web Community Portal: SemanticWeb.org is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth, Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999

XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

‘The Semantic Web as it is advocated by people like Tim Berners-Lee and James Hendler does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a generic ontology is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics, that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology, http://ontology.ip.rm.cnr.it


SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren


The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

1 Tim Berners-Lee, Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee, Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
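The view-based export described above can be sketched as follows. This is a minimal illustration and not the FMS implementation: the two-table layout, the table names `item` and `origin`, and the function names are assumptions made for the example, and each item has only one row here, so no grouping of several rows is needed. The column names follow the ITEM_VIEW of graphic 1 and the namespace is the fictitious one used in this primer.

```python
import sqlite3
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace used in this primer
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]

def build_database():
    """Create a tiny in-memory database with two tables and a joining view (hypothetical layout)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, subject TEXT, iconclass TEXT);
        CREATE TABLE origin (item_id TEXT, creator TEXT, manuscript TEXT, place TEXT, year TEXT);
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.subject, i.iconclass,
                   o.creator, o.manuscript, o.place, o.year
            FROM item i JOIN origin o ON o.item_id = i.item_id;
    """)
    con.execute("INSERT INTO item VALUES (?, ?, ?, ?)",
                ("image5kb78d38i", "Column Miniature", "Eve emerges from Adam's body", "71A3421"))
    con.execute("INSERT INTO origin VALUES (?, ?, ?, ?, ?)",
                ("image5kb78d38i", "Alexander Master", "Historic Bible", "Utrecht", "circa 1430"))
    return con

def rows_to_xml(con):
    """Query the view and return {item_id: XML string}, one document per collection item."""
    ET.register_namespace("mi", MI_NS)
    documents = {}
    query = "SELECT item_id, " + ", ".join(FIELDS) + " FROM item_view ORDER BY item_id"
    for row in con.execute(query):
        item_id, values = row[0], row[1:]
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for name, value in zip(FIELDS, values):
            ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents

docs = rows_to_xml(build_database())
print(docs["image5kb78d38i"])
```

The view hides the join from the export code, which only ever sees one flat row per item; that is the role the info box assigns to the view.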

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents
A short presentation is provided in Eero Hyvönen et al., Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen, Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al., Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services, Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf

ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Graphic 1: Database rows to XML process

Database rows (grouped by item):
image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

XML document (item data from database rows), image5kb78d38i.xml:

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
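That the prefix is only a placeholder for the namespace name can be seen in a small sketch using Python's standard XML parser. The prefix `img` and the second vocabulary at `http://example.org/textiles` are hypothetical and serve only to show the contrast; the m-i.org namespace is the fictitious one from the primer.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace from the primer

# The same elements written with two different prefixes bound to the same namespace name.
doc_a = f'<mi:image xmlns:mi="{MI_NS}"><mi:type>Column Miniature</mi:type></mi:image>'
doc_b = f'<img:image xmlns:img="{MI_NS}"><img:type>Column Miniature</img:type></img:image>'
# An element also called "type", but from an unrelated (hypothetical) vocabulary.
doc_c = '<t:type xmlns:t="http://example.org/textiles">wool</t:type>'

def qualified_names(root):
    """List the namespace-qualified names of an element and all its descendants."""
    return [element.tag for element in root.iter()]

names_a = qualified_names(ET.fromstring(doc_a))
names_b = qualified_names(ET.fromstring(doc_b))
name_c = ET.fromstring(doc_c).tag

print(names_a)             # names are expanded to {namespace}localname
print(names_a == names_b)  # the prefix itself plays no role in the comparison
print(name_c == names_a[1])
```

The parser replaces each prefix by the full namespace name, so the two `type` elements from different vocabularies stay distinct even though their local names collide.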

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags, and because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>) they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
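The syntax rules listed above can be tried out with any XML parser. The following sketch uses Python's standard library and a helper function, `is_well_formed`, that is an assumption of this example; the test documents are shortened variants of the document in graphic 1.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the XML parser accepts the text, False if it reports a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

PROLOG = '<?xml version="1.0"?>'
NS = 'xmlns:mi="http://www.m-i.org/images"'

examples = {
    "well-formed": f'{PROLOG}<mi:image {NS} image_id="image5kb78d38i">'
                   f'<mi:creator>Alexander Master</mi:creator></mi:image>',
    # Start and end tag differ in case, so they do not match.
    "case mismatch": f'{PROLOG}<mi:image {NS}><mi:Creator>Alexander Master</mi:creator></mi:image>',
    # The creator element is never closed.
    "missing end tag": f'{PROLOG}<mi:image {NS}><mi:creator>Alexander Master</mi:image>',
    # The attribute value is not within quotation marks.
    "unquoted attribute": f'{PROLOG}<mi:image {NS} image_id=image5kb78d38i></mi:image>',
    # Two top-level elements instead of a single root element.
    "two roots": f'{PROLOG}<mi:image {NS}></mi:image><mi:image {NS}></mi:image>',
}

results = {label: is_well_formed(text) for label, text in examples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```

Well-formedness says nothing about whether the right elements are present; that is checked separately against a schema.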

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
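What validation against this schema amounts to can be sketched by hand. Python's standard library has no XML Schema validator, so the code below is not a schema processor: it merely re-implements, as an assumption-laden illustration, the constraints that the example schema expresses (the root element `image` in the m-i.org namespace, the required attribute `image_id`, and the ordered sequence of seven text-only child elements). The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace from the primer
# Child elements in the order fixed by the xs:sequence of the example schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]

def check_image_document(xml_text):
    """Return a list of problems; an empty list means all hand-written checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not image in the m-i.org namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != expected:
        problems.append("child elements do not match the required sequence")
    for child in root:
        if len(child) > 0:  # xs:string elements may not contain child elements
            problems.append(f"{child.tag} must contain text only")
    return problems

def make_document(children, image_id=None):
    """Build a small test document from (name, value) pairs."""
    attribute = f' image_id="{image_id}"' if image_id else ""
    body = "".join(f"<mi:{name}>{value}</mi:{name}>" for name, value in children)
    return f'<mi:image xmlns:mi="{MI_NS}"{attribute}>{body}</mi:image>'

VALUES = ["Column Miniature", "Eve emerges from Adam's body", "71A3421",
          "Alexander Master", "Historic Bible", "Utrecht", "circa 1430"]
pairs = list(zip(EXPECTED_CHILDREN, VALUES))

valid = make_document(pairs, image_id="image5kb78d38i")
# Same data, but place and year swapped and no image_id attribute.
invalid = make_document(pairs[:5] + [pairs[6], pairs[5]])

print(check_image_document(valid))
print(check_image_document(invalid))
```

In practice a museum would hand the document and the schema to a validating parser rather than write such checks itself; the sketch only makes visible what kind of rules the schema encodes.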

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling for example the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.⁴

Ontology
In the Semantic Web architecture the semantic relationships are not embedded, but explicitly represented by an ontology or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.⁵

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,⁶ and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.⁷

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place⁸ that provides the hierarchical path for this concept in the classification system.
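For a plain notation such as 71A3421, the hierarchical path can be read off the code itself, because every level adds one character. The sketch below illustrates this; the function name is hypothetical, the labels are the ones given for this path in this Thematic Issue, and the prefix rule is an assumption that holds for this simple notation only (fuller Iconclass notations can contain further constructs that the sketch ignores).

```python
# Labels for the path of 71A3421 as printed in this Thematic Issue.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise and later years of Adam and Eve",
    "71A3": "creation of man, the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

def hierarchical_path(notation):
    """Return all ancestors of a simple notation, from the top level down to the notation itself."""
    return [notation[:length] for length in range(1, len(notation) + 1)]

path = hierarchical_path("71A3421")
for depth, code in enumerate(path):
    print("  " * depth + f"{code} {LABELS[code]}")
```

Because broader concepts are prefixes of narrower ones, a search for 71A3 can also retrieve images classified under 71A3421; this is the kind of hierarchical relationship a taxonomy provides.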

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.⁹

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’¹⁰

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).¹¹ This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontologybuilding are proliferating at the present timeTheseontology editors need to be carefully assessed as theircapabilities differ considerably12


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
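The hierarchical path above can be read directly off the notation: each leading substring of 71A3421 names the next broader concept. The following Python lines are a minimal sketch of that observation, not a description of how the Iconclass Browser works. The function name is hypothetical, and the sketch assumes a plain notation of digits and capital letters only; notations with bracketed text, keys or other extensions are not handled.

```python
def iconclass_path(notation: str) -> list[str]:
    """Return the notation and all broader notations, broadest first.

    Assumption: a plain alphanumeric notation, in which every leading
    substring is itself the notation of a broader concept.
    """
    if not notation.isalnum():
        raise ValueError("only plain alphanumeric notations are handled")
    return [notation[:i] for i in range(1, len(notation) + 1)]


path = iconclass_path("71A3421")
print(" > ".join(path))
```

For 71A3421 the function yields exactly the seven notations listed above, from 7 (Bible) down to 71A3421.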

6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.

7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl

8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml

9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html

10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327

11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.

12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).

RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs) and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
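To make the triple concrete, the following Python lines sketch one possible in-memory form of a statement: subject and predicate are URIs, and the object is either a URI or a literal. This is only an illustration of the data model, not part of any RDF toolkit. The class names, the image URI and the `#` separator in the mi namespace are assumptions made for the example.

```python
from typing import NamedTuple, Union


class URI(str):
    """A resource identified by a Uniform Resource Identifier."""


class Literal(str):
    """A plain value such as a name; drawn as a box in the graph."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    object: Union[URI, Literal]


MI = "http://www.m-i.org/schemas/images#"  # assumed form of the namespace

statement = Triple(
    subject=URI("http://www.m-i.org/images/5kb78d38i"),  # hypothetical image URI
    predicate=URI(MI + "hasCreator"),
    object=Literal("Alexander Master"),
)


def is_valid(t: Triple) -> bool:
    """Subject and predicate must be URIs; the object is a URI or a literal."""
    return (
        isinstance(t.subject, URI)
        and isinstance(t.predicate, URI)
        and isinstance(t.object, (URI, Literal))
    )


print(is_valid(statement))
```

A statement with a literal in the subject position would be rejected by `is_valid`, which mirrors the rule that only objects may be literals.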

Graphic 2: RDF Data Model

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
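The two subclass statements of graphic 2 can also be written out as RDF/XML. The Python sketch below builds such a document with the standard library only; it is one of several equivalent RDF/XML spellings, not the serialisation used by the FMS project. The rdf and rdfs namespace URIs are the W3C ones; the `#` separator in the mi namespace is an assumption.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # assumed form of the namespace

# Ask ElementTree to use the customary prefixes when serialising.
ET.register_namespace("rdf", RDF)
ET.register_namespace("rdfs", RDFS)

# (subclass, superclass) pairs taken from graphic 2.
subclass_of = [
    (MI + "ColumnMiniature", MI + "Miniature"),
    (MI + "Miniature", MI + "Image"),
]

root = ET.Element(f"{{{RDF}}}RDF")
for child, parent in subclass_of:
    # One rdf:Description per subject, with the predicate as a child element.
    desc = ET.SubElement(root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": child})
    ET.SubElement(desc, f"{{{RDFS}}}subClassOf", {f"{{{RDF}}}resource": parent})

rdf_xml = ET.tostring(root, encoding="unicode")
print(rdf_xml)
```

Each `rdf:Description` element carries the subject in `rdf:about`, and the object of the statement appears in `rdf:resource`.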

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
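The way such a template is instantiated can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Meedio implementation: the XML record, the `XPath` marker class and the function `apply_rule` are hypothetical, and the record is chosen so that the result mirrors the triple quoted above. Only the limited XPath subset of the standard library is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the museum's XML record.
RECORD = """
<image id="5kb78d38i">
  <type>Image Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""


class XPath(str):
    """Marks a template slot that is filled from the XML document."""


# A rule is a list of triple templates; the predicate here is fixed.
RULE = [(XPath("./type"), "mi:hasCreator", XPath("./creator"))]


def apply_rule(rule, xml_text):
    """Instantiate every XPath slot; return [] if the rule does not match."""
    root = ET.fromstring(xml_text)
    triples = []
    for template in rule:
        filled = []
        for slot in template:
            if isinstance(slot, XPath):
                value = root.findtext(str(slot))
                if value is None:  # no matching element: the rule fails
                    return []
                slot = value.strip()
            filled.append(slot)
        triples.append(tuple(filled))
    return triples


print(apply_rule(RULE, RECORD))
```

A rule whose XPath expression finds no element yields no triples at all, which corresponds to the condition ‘if the rule matches’ in the text.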

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same term refers to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
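The difference between the two cases can be illustrated with a small Python sketch. It is not the FMS implementation: the classes, the synonym sets and the function names are invented for the example. A term found in exactly one synonym set is mapped automatically, while a term found in several sets is left to a human choice.

```python
# Hypothetical ontology classes and synonym sets, for illustration only.
SYNONYMS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:Initial": {"initial", "decorated initial"},
    "mi:Portrait": {"miniature", "portrait"},  # 'miniature' is polysemous here
}


def candidate_classes(term):
    """Return every class whose synonym set contains the term."""
    term = term.lower()
    return sorted(cls for cls, names in SYNONYMS.items() if term in names)


def map_term(term, choose=None):
    """Map a term to one class; ambiguous terms need a human choice."""
    candidates = candidate_classes(term)
    if len(candidates) == 1:
        return candidates[0]
    if len(candidates) > 1 and choose is not None:
        return choose(candidates)  # the cataloguer selects the interpretation
    return None  # unknown term, or ambiguous without a cataloguer


print(map_term("Illumination"))
print(candidate_classes("miniature"))
```

Passing a `choose` callback stands in for the cataloguer's decision; without it, an ambiguous term is simply not mapped.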

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images#

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
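The implicit subclass relation follows mechanically from the two explicit statements. As a minimal Python sketch (the function name and the prefixed-name strings are choices made for this example), the superclasses of a class can be collected by following rdfs:subClassOf upwards until nothing new is found:

```python
# (subclass, superclass) pairs as stated in the text.
SUBCLASS_OF = {
    ("mi:Miniature", "mi:Image"),
    ("mi:ColumnMiniature", "mi:Miniature"),
}


def superclasses(cls, pairs):
    """Return all direct and inherited superclasses of cls."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for child, parent in pairs:
            if child == current and parent not in found:
                found.add(parent)
                todo.append(parent)  # keep climbing from the new superclass
    return found


print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_OF)))
```

For mi:ColumnMiniature this returns both mi:Miniature and mi:Image, although only the first of these was stated directly.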

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images#) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
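Read as constraints, as the text does, domain and range can be checked mechanically against a set of statements. The Python sketch below is an illustration only and not the FMS validator; the `ex:` instances and all function names are invented for the example. A resource counts as an instance of its declared class and of every superclass of that class.

```python
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical instance data; the ex: resources are invented.
DATA = [
    ("ex:img1", "rdf:type", "mi:ColumnMiniature"),
    ("ex:img2", "rdf:type", "mi:ColumnMiniature"),
    ("ex:img3", "rdf:type", "mi:Image"),
    ("ex:img1", "mi:hasType", "ex:img2"),  # satisfies both constraints
    ("ex:img3", "mi:hasType", "ex:img1"),  # subject is only an mi:Image
]


def types_of(resource, triples):
    """Declared classes of a resource plus all their superclasses."""
    result = {o for s, p, o in triples if s == resource and p == "rdf:type"}
    todo = list(result)
    while todo:
        cls = todo.pop()
        for child, parent in SUBCLASS_OF:
            if child == cls and parent not in result:
                result.add(parent)
                todo.append(parent)
    return result


def violations(triples):
    """List every statement that breaks a domain or range constraint."""
    problems = []
    for s, p, o in triples:
        if p in DOMAIN and DOMAIN[p] not in types_of(s, triples):
            problems.append((s, p, o, "domain"))
        if p in RANGE and RANGE[p] not in types_of(o, triples):
            problems.append((s, p, o, "range"))
    return problems


print(violations(DATA))
```

The second mi:hasType statement is reported because its subject is declared only as an mi:Image, which is broader than the class required by the domain constraint.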

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:

| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.’ [13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
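Because an RDF graph is a set of triples, combining the harvested RDF cards amounts to a set union with the shared ontology. The Python sketch below illustrates this step only; the museum names, the `ex:` resources and the function name are hypothetical, and fetching the cards over the Web is left out.

```python
def build_repository(ontology, cards):
    """Union the shared ontology with every harvested RDF card."""
    graph = set(ontology)
    for card in cards:
        graph |= card  # identical statements collapse into one
    return graph


# Hypothetical data: one ontology statement and two museums' RDF cards.
ONTOLOGY = {("ex:Tapestry", "rdfs:subClassOf", "ex:Textile")}
CARDS = {
    "museum-a": {("ex:a1", "rdf:type", "ex:Tapestry")},
    "museum-b": {
        ("ex:b7", "rdf:type", "ex:Textile"),
        ("ex:a1", "rdf:type", "ex:Tapestry"),  # same statement as museum-a
    },
}

repository = build_repository(ONTOLOGY, CARDS.values())
print(len(repository))
```

A statement published by two museums appears only once in the combined graph, since the union discards duplicates.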

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
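How Ontogator implements view-based filtering is not described here, so the following Python lines are only a sketch of the idea under simple assumptions: each selected class matches its own instances and those of its subclasses, and several selections are combined by intersection. The graph, the `ex:` names and both functions are invented for the example.

```python
# Hypothetical semantic graph: one subclass statement and typed instances.
GRAPH = {
    ("ex:Tapestry", "rdfs:subClassOf", "ex:Textile"),
    ("ex:item1", "rdf:type", "ex:Tapestry"),
    ("ex:item1", "rdf:type", "ex:WoolObject"),
    ("ex:item2", "rdf:type", "ex:Textile"),
    ("ex:item3", "rdf:type", "ex:WoolObject"),
}


def instances_of(cls, graph):
    """Instances typed with cls or with any of its subclasses."""
    wanted, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if p == "rdfs:subClassOf" and o == current and s not in wanted:
                wanted.add(s)
                todo.append(s)
    return {s for s, p, o in graph if p == "rdf:type" and o in wanted}


def view_filter(selected_classes, graph):
    """Keep only the instances that match every selected class (view)."""
    result = None
    for cls in selected_classes:
        found = instances_of(cls, graph)
        result = found if result is None else result & found
    return result or set()


print(sorted(view_filter(["ex:Textile"], GRAPH)))
print(sorted(view_filter(["ex:Textile", "ex:WoolObject"], GRAPH)))
```

Adding a second class to the selection narrows the result, which corresponds to constraining the views further.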

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans [14].

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems [15].

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action, we mean reactive, proactive, social.’

| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’

| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)

| Learning: an agent will improve its performance over time


14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


The Finnish Museums on the Semantic Web: Overview of the System’s Set-up

[Graphic 3. Labelled components: user client (WWW browser; topic-based navigation; view-based filtering); server (Ontogator software; RDF database; semantic graph: knowledge space of shared ontology and metadata); Web crawler; RDF instances (RDF Schema: semantic interoperability); metadata editor and XML repositories (XML Schema: syntactic interoperability); collection databases 1, 2 ... n (relational schemas, DBMS).]

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System


THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria

Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy

Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France

Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece

Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands

Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.

ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:

| CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.

| SPECTRUM-XML: a project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.

| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.

E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy

Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands

Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.

Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersen@den.nl

Marco Meli, EDW International, Italy

Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK

Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands

Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy

Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK

Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies. See http://cweb.inria.fr/Projects/Mesmuses. E-mail: scottian@galileo.imss.firenze.it


DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft, Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)

Authors
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master a.o.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

Genesis - The Creation: Stars and Fishes

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools. Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council for Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr. A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios. See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html


OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing, and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999

XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30

For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a generic ontology is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

¹ Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
² Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
³ Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

The major goals of the project are to make collection metadata which stem from heterogeneous databases semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
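The view-and-group step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the table names (items, properties), the view name item_view, the function names and the sample rows are all hypothetical, and an in-memory SQLite database stands in for a museum's collection database.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby


def demo_database():
    """Build a tiny in-memory database (hypothetical schema and sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (item_id TEXT PRIMARY KEY, type TEXT);
        CREATE TABLE properties (seq INTEGER PRIMARY KEY, item_id TEXT,
                                 name TEXT, value TEXT);
        INSERT INTO items VALUES ('image5kb78d38i', 'Column Miniature');
        INSERT INTO properties (item_id, name, value) VALUES
            ('image5kb78d38i', 'creator', 'Alexander Master'),
            ('image5kb78d38i', 'place', 'Utrecht');
        -- The view is a virtual table defined by a query joining two tables.
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, p.seq, p.name, p.value
            FROM items AS i JOIN properties AS p ON p.item_id = i.item_id;
    """)
    return conn


def item_documents(conn):
    """Query the view and combine each item's rows into one XML document."""
    rows = conn.execute(
        "SELECT item_id, type, name, value FROM item_view "
        "ORDER BY item_id, seq"
    ).fetchall()
    documents = {}
    # Rows arrive sorted by item_id, so groupby yields one group per item.
    for item_id, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        root = ET.Element("image", {"image_id": item_id})
        ET.SubElement(root, "type").text = group[0][1]
        for _, _, name, value in group:
            ET.SubElement(root, name).text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents
```

Calling item_documents(demo_database()) returns a dictionary with one XML string per collection item. The sketch leaves out namespaces and schema conformance, which the primer introduces in the following sections.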

XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero, et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
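The point that tags mean nothing to the parser itself can be illustrated with a small Python sketch. The function names and the LABELS table are hypothetical and only stand in for a ‘processing program’; the parser is the one from Python's standard library.

```python
import xml.etree.ElementTree as ET


def parsed_view(xml_text):
    """Return what a generic XML parser sees: element names and text only."""
    root = ET.fromstring(xml_text)
    return [(element.tag, element.text) for element in root]


# Hypothetical processing rule: the application, not XML, decides what a
# 'creator' element means. The parser treats every tag name alike.
LABELS = {"creator": "Made by"}


def render(xml_text):
    """Apply the application's own interpretation to the parsed elements."""
    return [
        f"{LABELS[tag]}: {text}"
        for tag, text in parsed_view(xml_text)
        if tag in LABELS
    ]
```

To the parser, <creator> and <H1> are just two names attached to strings; only the mapping that the application supplies gives one of them a meaning.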

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2, 10 a) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.

Database rows, grouped by item:
'image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process: item data from database rows are combined into the XML document image5kb78d38i.xml:

(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
               image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). The namespace declared in the start tag of our root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
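The syntax rules above can be tried out with any XML parser. The following Python sketch uses the standard library's ElementTree parser; the function name is_well_formed and the sample strings are illustrative assumptions, and the XML declaration is left out of the samples for brevity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


NS = 'xmlns:mi="http://www.m-i.org/images"'  # fictitious primer namespace

GOOD = (
    f'<mi:image {NS} image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator></mi:image>"
)
# Start and end tag differ in case, so they do not match.
WRONG_CASE = (
    f"<mi:image {NS}>"
    "<mi:Creator>Alexander Master</mi:creator></mi:image>"
)
# The attribute value is not within quotation marks.
UNQUOTED = f"<mi:image {NS} image_id=image5kb78d38i></mi:image>"
# Child elements overlap instead of being properly nested.
NOT_NESTED = (
    f"<mi:image {NS}><mi:type><mi:place></mi:type></mi:place></mi:image>"
)
# Two top-level elements: there is no single root element.
TWO_ROOTS = f"<mi:image {NS}></mi:image><mi:image {NS}></mi:image>"
```

Only GOOD passes is_well_formed; each of the other strings breaks exactly one of the rules listed above. Note that well-formedness says nothing yet about whether the right elements are present, which is the task of the XML Schema.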

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
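To make the effect of the schema concrete, the sketch below hand-codes two of its rules in Python: the ordered sequence of child elements (5-13) and the required image_id attribute (14). This is not XML Schema validation; a real set-up would pass the schema to a schema-aware validator, which Python’s standard library does not include. The function name check_image_document and the value of the type element are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"

# Order of the child elements as declared in lines (6)-(12) of the schema.
EXPECTED_SEQUENCE = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

# Illustrative instance document (assumption: the value of mi:type).
VALID = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:type>Miniature</mi:type>"
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:iconclass>71A3421</mi:iconclass>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:manuscript>Historic Bible</mi:manuscript>"
    "<mi:place>Utrecht</mi:place>"
    "<mi:year>circa 1430</mi:year>"
    "</mi:image>"
)


def check_image_document(document):
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    root = ET.fromstring(document)
    if root.tag != f"{{{NS}}}image":
        problems.append("root element is not image in the expected namespace")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    expected = [f"{{{NS}}}{name}" for name in EXPECTED_SEQUENCE]
    if found != expected:
        problems.append("child elements do not match the declared sequence")
    return problems
```

A document that swaps two child elements or omits image_id is still well-formed XML, but it fails these checks, which is the distinction between well-formedness and validity that the schema introduces.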

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge. [4]

Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge. [5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies, [6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see: Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate. [7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is: 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture. [9]

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’ [10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM). [11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably. [12]


7 Bible
71 Old Testament
71A Genesis: from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/businessmsstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.’
Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
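As a rough illustration of the data model, the two statements that graphic 2 depicts can be written down as plain subject-predicate-object tuples. The Python sketch below is an assumption-laden toy, not an RDF library: the Triple type and the helpers objects_of and to_ntriples are hypothetical names, the '#' separator in the m-i.org namespace is assumed, and the line-based output only handles objects that are URIs, not literals.

```python
from typing import NamedTuple

MI = "http://www.m-i.org/schemas/images#"       # assumed '#' separator
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


class Triple(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI of the property (the arc label)
    obj: str        # URI of the value (literals are left out of this sketch)


# The two statements shown in graphic 2.
GRAPH = [
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
]


def objects_of(graph, subject, predicate):
    """Follow the arcs labelled `predicate` that leave the node `subject`."""
    return [t.obj for t in graph
            if t.subject == subject and t.predicate == predicate]


def to_ntriples(graph):
    """Write each triple on one line, with every URI in angle brackets."""
    return "\n".join(f"<{t.subject}> <{t.predicate}> <{t.obj}> ."
                     for t in graph)
```

Because Miniature appears once as an object and once as a subject, the two triples form a connected graph, which is what the nodes and arcs of graphic 2 show.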

Graphic 2: RDF Data Model

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.

Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38i/type mi:hasCreator image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
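The idea of a mapping rule can be sketched in a few lines of Python. This is a hypothetical illustration and not the FMS editor tool: the rule format (a tuple of two element paths and a property name), the function apply_rule and the shortened record are all assumptions, and the standard library’s ElementTree supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace mapping used when evaluating the element paths.
NS = {"mi": "http://www.m-i.org/images"}

# Shortened record (assumption: the value of mi:type follows the result
# shown in the text).
RECORD = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Image Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Template: (path to the subject value, property, path to the object value).
RULE = ("mi:type", "mi:hasCreator", "mi:creator")


def apply_rule(document, rule):
    """Instantiate the template with the matching element values.

    Returns a list with one triple if both paths match, else an empty list.
    """
    root = ET.fromstring(document)
    subject_path, prop, object_path = rule
    subject = root.findtext(subject_path, namespaces=NS)
    value = root.findtext(object_path, namespaces=NS)
    if subject is None or value is None:
        return []  # the rule does not match this document
    return [(subject, prop, value)]
```

Applied to RECORD, the rule yields the single triple ('Image Miniature', 'mi:hasCreator', 'Alexander Master'), mirroring the result given in the text; a rule that refers to an element the record lacks produces no triples.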

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
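The transitivity of rdfs:subClassOf can be made tangible with a small Python sketch that derives the implicit superclasses from the two explicit statements. The dictionary layout and the function name superclasses are assumptions made for illustration; an RDF Schema-aware store would perform this inference itself.

```python
# Explicit rdfs:subClassOf statements from the text, keyed by subclass.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return all direct and implied superclasses of `cls`."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen
```

For mi:ColumnMiniature the function returns both mi:Miniature (stated explicitly) and mi:Image (implied by transitivity).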

Graphic 2 on page 32 visualises this with the nodesand arcs of the basic RDF data model

Defining properties

In order to make the meaning of our metadata (e.g. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
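Read as constraints, in the way the text describes them, the domain and range declarations can be checked against individual statements. The Python sketch below is a toy under stated assumptions: the instance names (ex:miniature1, ex:initial1), the class mi:DecoratedInitial and the function names are hypothetical, and each class is given at most one direct superclass to keep the lookup short.

```python
# Class hierarchy (child -> parent); mi:DecoratedInitial is an assumed class.
SUBCLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
    "mi:DecoratedInitial": "mi:Image",
}

# rdf:type of two hypothetical instances.
TYPES = {
    "ex:miniature1": "mi:ColumnMiniature",
    "ex:initial1": "mi:DecoratedInitial",
}

# Declarations from the text: mi:hasType has mi:ColumnMiniature as both
# its domain and its range.
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}


def is_instance_of(resource, cls):
    """True if the resource's class is `cls` or one of its subclasses."""
    current = TYPES.get(resource)
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def violations(statement):
    """Return which of the two constraints a (s, p, o) statement breaks."""
    subject, prop, obj = statement
    problems = []
    if prop in DOMAIN and not is_instance_of(subject, DOMAIN[prop]):
        problems.append("domain")
    if prop in RANGE and not is_instance_of(obj, RANGE[prop]):
        problems.append("range")
    return problems
```

A statement whose subject is a decorated initial breaks the domain constraint, and one whose value is a decorated initial breaks the range constraint; this is the kind of check a semantic metadata validator would run when statements are saved.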

Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’ [13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
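The principle of view-based filtering described above can be sketched as follows. This Python toy is not Ontogator: the instance records, museum names, the class mi:DecoratedInitial and the function filter_view are all assumptions made for illustration, using the primer’s medieval-image classes rather than the FMS textiles ontology.

```python
# Class hierarchy (child -> parent) of the shared ontology.
SUBCLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
    "mi:DecoratedInitial": "mi:Image",
}

# Harvested instance descriptions from two hypothetical museums.
INSTANCES = {
    "museumA:item1": {"type": "mi:ColumnMiniature", "place": "Utrecht"},
    "museumA:item2": {"type": "mi:DecoratedInitial", "place": "Utrecht"},
    "museumB:item7": {"type": "mi:Miniature", "place": "Bruges"},
}


def belongs_to(cls, selected):
    """True if `cls` is the selected class or one of its subclasses."""
    while cls is not None:
        if cls == selected:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_view(instances, selected_class, **constraints):
    """Return the sorted names of instances that match the class selection
    and every additional property constraint."""
    return sorted(
        name for name, record in instances.items()
        if belongs_to(record["type"], selected_class)
        and all(record.get(key) == value for key, value in constraints.items())
    )
```

Selecting mi:Miniature returns instances from both museums, because the column miniature held by the first museum is found through the class hierarchy; adding the constraint place="Utrecht" narrows the view to a single instance.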

From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]

This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems. [15]

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action, we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)
| Learning: an agent will improve its performance over time

[14] T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
[15] M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer

K. S. Candan, H. Liu, R. Suvarna, Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin, RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik, Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf

The Finnish Museums on the Semantic Web: Overview of the System’s Set-up

| User client: WWW browser (topic-based navigation, view-based filtering)
| Server: Ontogator software, RDF database
| Semantic graph: knowledge space of shared ontology and metadata
| Web crawler: harvests the RDF instances from the museums
| Museums (collection databases 1 to n): collection database (relational schemas, DBMS), XML repository, metadata editor, RDF instances
| RDF Schema: semantic interoperability
| XML Schema: syntactic interoperability

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See: http://www.asrm.archiv.beniculturali.it/sidimago/IMAGOII/en.html

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference. See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies. See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottiangalileoimssfirenzeit


DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT, please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th 2002 in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd 2002 in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r; size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r; size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v; size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra; size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1; size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v; size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r; size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil/

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing, and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt/

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-ref/, and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide/

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req/

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org

The Semantic Web Community Portal SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb/

SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE/

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo/

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999

XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML/; the XML Specifications can be found at http://www.w3.org/XML/Core/

‘The bane of my existence is doing things that I know the computer could do for me’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

‘The Semantic Web as it is advocated by people like Tim Berners-Lee and James Hendler does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that a generic ontology has to pin down is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

Developing well-founded generic ontologies seems an enormous job, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with and with funding from major organisations, including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer/).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/

Syntactic Transformation 1: Creating the XML Documents

In the FMS system the Extensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum's collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a 'view', which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item the set of rows is combined into a single XML document.
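The grouping step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the row layout (one `(item_id, column, value)` tuple per field), the element name `image` and the function name `rows_to_documents` are hypothetical and are not taken from the FMS system, and namespaces are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows as a database view might return them:
# one (item_id, column, value) tuple per metadata field.
rows = [
    ("image5kb78d38i", "type", "Column Miniature"),
    ("image5kb78d38i", "creator", "Alexander Master"),
    ("image5kb78d38i", "place", "Utrecht"),
]

def rows_to_documents(rows):
    """Group rows by item and build one XML element tree per item."""
    documents = {}
    for item_id, column, value in rows:
        if item_id not in documents:
            # First row for this item: create its root element.
            documents[item_id] = ET.Element("image", {"image_id": item_id})
        ET.SubElement(documents[item_id], column).text = value
    return documents

docs = rows_to_documents(rows)
xml_text = ET.tostring(docs["image5kb78d38i"], encoding="unicode")
print(xml_text)
```

Each item ends up as one root element whose children mirror the view's columns, which is the shape the later schema validation expects.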

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents

A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

View ITEM_VIEW, columns: item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various pieces of elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image, 71A3421 Eve emerges from Adam's body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document (the declaration is optional in XML 1.0, but recommended). In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Graphic 1: Database rows to XML process

Database rows, grouped by item (item data from database rows):
'image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam's body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

XML document (image5kb78d38i.xml), result of the rows to XML process:

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace, declared in the start tag of the root element, is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type></mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
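The syntax rules listed above are exactly what an XML parser enforces. As a minimal sketch, the following Python snippet uses the standard library parser to test a shortened version of the example document and two deliberately broken variants. The helper name `is_well_formed` and the variable names are illustrative choices only.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)
# Start and end tag differ in case, so they no longer match.
bad_case = good.replace("</mi:creator>", "</mi:Creator>")
# The attribute value is no longer within quotation marks.
bad_quotes = good.replace('"image5kb78d38i"', "image5kb78d38i")

for name, text in [("good", good), ("bad_case", bad_case), ("bad_quotes", bad_quotes)]:
    print(name, is_well_formed(text))
```

Well-formedness says nothing about whether the right elements are present; that is the job of schema validation.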

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative's XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.

(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation m-i.org.

(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.

(2c) Is our default namespace.

(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.

(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.

(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.

(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
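To make the constraints in the schema concrete, here is a small hand-written check in Python. It is not an XML Schema validator and does not read the schema file. It only re-implements three of the rules explained above (namespace-qualified root, required image_id attribute, ordered sequence of child elements) as an illustration. The function name `check_image` and the constant names are assumptions.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"
# Child elements in the order given by the xs:sequence of the schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

def check_image(text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(text)
    problems = []
    if root.tag != "{%s}image" % NS:
        problems.append("root element must be a namespace-qualified image")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    expected = ["{%s}%s" % (NS, name) for name in EXPECTED_CHILDREN]
    if [child.tag for child in root] != expected:
        problems.append("child elements must appear once each, in schema order")
    return problems

valid = (
    '<mi:image xmlns:mi="%s" image_id="image5kb78d38i">' % NS
    + "".join("<mi:%s>x</mi:%s>" % (n, n) for n in EXPECTED_CHILDREN)
    + "</mi:image>"
)
invalid = valid.replace(' image_id="image5kb78d38i"', "")

print(check_image(valid))
print(check_image(invalid))
```

In practice a museum would run a real schema validator against the FMS schema; the sketch only shows what kind of mistakes such validation catches.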

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling for example the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4

Ontology

In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.

5 For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9

The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'10

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties which describes, in a formal language, concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
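In the path shown above, every level is a prefix of the next one, so the path of a simple notation can be derived from the code alone. The following Python sketch relies on that observation. It is an assumption that holds for this example; real Iconclass notations can also contain brackets, keys and doubled letters, which this sketch does not handle. The function name `hierarchical_path` is hypothetical.

```python
# Labels copied from the path shown for notation 71A3421.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

def hierarchical_path(notation):
    """Return all prefixes of a simple notation, from broadest to narrowest."""
    return [notation[:n] for n in range(1, len(notation) + 1)]

path = hierarchical_path("71A3421")
for code in path:
    print(code, LABELS.get(code, "(label not in this sketch)"))
```

This is what makes a hierarchical classification useful for retrieval: a search for 71A3 can match every image classified anywhere below it.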

6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.

7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl

8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml

9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html

10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327

11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.

12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html

32 DigiCULT

Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core, RDF is very abstract, very dry and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled, directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
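The triple model can be written down directly as data. The sketch below, in Python, stores two subclass statements as (subject, predicate, object) tuples and derives the nodes and arc labels of the corresponding graph. The `#` separator in the m-i.org URIs is an assumption made for this illustration, and `Triple`, `nodes` and `arc_labels` are hypothetical names.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate object")

# Namespace URIs; the '#' after the m-i.org schema URI is assumed here.
MI = "http://www.m-i.org/schemas/images#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}

def nodes(graph):
    """Subjects and objects are the nodes of the labelled, directed graph."""
    return {t.subject for t in graph} | {t.object for t in graph}

def arc_labels(graph):
    """Predicates are the labels on the arcs."""
    return {t.predicate for t in graph}

print(sorted(nodes(graph)))
print(sorted(arc_labels(graph)))
```

Because Miniature is the object of one triple and the subject of the other, the two statements join into a single connected graph.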

Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule
<image5kb78d38i/type mi:hasCreator image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator 'Alexander Master'>
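The idea of a mapping rule can be sketched as a triple template whose subject and object slots are path expressions evaluated against an XML document. The Python snippet below is an illustration only: it uses the limited XPath subset of the standard library, and the rule format, the names `RULE` and `apply_rule`, and the shortened document are assumptions, not the FMS project's actual rule syntax.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

DOC = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:type>Column Miniature</mi:type>"
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)

# Hypothetical rule: (path for the subject, property name, path for the object).
RULE = ("mi:type", "mi:hasCreator", "mi:creator")

def apply_rule(rule, root):
    """Instantiate the template; return no triples if the rule does not match."""
    subject_path, prop, object_path = rule
    subject = root.find(subject_path, NS)
    obj = root.find(object_path, NS)
    if subject is None or obj is None:
        return []
    return [(subject.text, prop, obj.text)]

triples = apply_rule(RULE, ET.fromstring(DOC))
print(triples)
```

A rule whose paths find nothing in a document simply produces no triples, which mirrors the "if the rule matches" condition in the text.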

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
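Transitivity means that implicit superclasses can be computed by following rdfs:subClassOf statements repeatedly. A minimal Python sketch, with the two subclass statements from the text stored in a plain dictionary (the dictionary layout and the function name `superclasses` are illustrative assumptions):

```python
# Direct rdfs:subClassOf statements: class -> set of direct superclasses.
sub_class_of = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}

def superclasses(cls, sub_class_of):
    """Return all direct and indirect superclasses of cls."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

print(sorted(superclasses("mi:ColumnMiniature", sub_class_of)))
```

The statement that mi:ColumnMiniature is a subclass of mi:Image is never written down; it follows from the two direct statements.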

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
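Read as constraints, as the text does, domain and range can be checked mechanically for each statement. The Python sketch below does this for the mi:hasType declarations given above. The instance names (`item1` to `item3`), the class `mi:Initial` and the function name `violations` are hypothetical, and the check ignores subclass relationships for brevity.

```python
# Property declarations as given in the text for mi:hasType.
SCHEMA = {
    "mi:hasType": {"domain": "mi:ColumnMiniature", "range": "mi:ColumnMiniature"},
}

# Hypothetical instances and the class each one belongs to.
TYPES = {
    "item1": "mi:ColumnMiniature",
    "item2": "mi:ColumnMiniature",
    "item3": "mi:Initial",
}

def violations(statement, schema, types):
    """Return which constraints ('domain', 'range') a statement violates."""
    subject, prop, obj = statement
    rule = schema.get(prop, {})
    problems = []
    if "domain" in rule and types.get(subject) != rule["domain"]:
        problems.append("domain")
    if "range" in rule and types.get(obj) != rule["range"]:
        problems.append("range")
    return problems

print(violations(("item1", "mi:hasType", "item2"), SCHEMA, TYPES))
print(violations(("item3", "mi:hasType", "item1"), SCHEMA, TYPES))
```

A semantic metadata validator of the kind described for the FMS editor tool would apply checks of this sort before the statements are saved.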

Benefits of RDF

In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'13

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
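View-based filtering can be pictured as intersecting class restrictions over the instances in the repository. The Python sketch below is a toy model and not a description of how Ontogator is implemented. It assumes each class has at most one direct superclass, and the instance `item2`, the class `mi:Initial` and the function names `is_a` and `view` are hypothetical.

```python
# Toy ontology: class -> direct superclass (single inheritance assumed).
SUB_CLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
    "mi:Initial": "mi:Image",
}

# Toy repository: instance -> class.
INSTANCES = {
    "image5kb78d38i": "mi:ColumnMiniature",
    "item2": "mi:Initial",
}

def is_a(cls, selected):
    """True if cls equals the selected class or is one of its subclasses."""
    while cls is not None:
        if cls == selected:
            return True
        cls = SUB_CLASS_OF.get(cls)
    return False

def view(selected_classes):
    """Instances matching every selected class restriction."""
    return sorted(
        instance for instance, cls in INSTANCES.items()
        if all(is_a(cls, selected) for selected in selected_classes)
    )

print(view(["mi:Image"]))
print(view(["mi:Image", "mi:Miniature"]))
```

Adding a further restriction can only shrink the result, which is how constraining the views step by step leads to the items sought.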

The software also supports topic-based navigationby providing semantic links between topics of inter-est the creation of which is based on the collectiondomain ontology and the related metadata of thecollection recordsThis means that the links alsoprovide the user with an impression of the widercontext and pragmatics of the objects in themuseumsrsquo collections

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14

This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’

| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’

| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

| Learning: an agent will improve its performance over time.
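The three aspects of flexible autonomous action can be made concrete with a toy sketch. It is only an illustration under strong simplifying assumptions: the class `Agent`, its methods and the record format are hypothetical and do not come from Wooldridge’s book or from any agent framework.

```python
from typing import Dict, List, Tuple


class Agent:
    """Toy agent that collects records on one topic (hypothetical design)."""

    def __init__(self, name: str, goal_topic: str) -> None:
        self.name = name
        self.goal_topic = goal_topic
        self.found: List[str] = []
        self.inbox: List[Tuple[str, str]] = []

    def react(self, environment: List[Dict[str, str]]) -> None:
        """Reactivity: respond to what the environment currently contains."""
        for record in environment:
            if (record["topic"] == self.goal_topic
                    and record["id"] not in self.found):
                self.found.append(record["id"])

    def pursue_goal(self, peers: List["Agent"]) -> None:
        """Proactiveness and social ability: while the goal is unmet,
        take the initiative and ask other agents for help."""
        if not self.found:
            for peer in peers:
                peer.inbox.append((self.name, "looking for " + self.goal_topic))
```

A message here is just a tuple placed in another agent’s inbox; a real multiagent system would use an agent communication language for this exchange.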


14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3(1) (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf

[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Title: The Finnish Museums on the Semantic Web: Overview of the System’s Set-up. Components shown: User client (WWW browser; topic-based navigation, view-based filtering); Server (Ontogator software; RDF database holding the semantic graph, i.e. the knowledge space of shared ontology and metadata); Web crawler; RDF Schema level (semantic interoperability): Metadata Editor, RDF instances; XML Schema level (syntactic interoperability): XML repositories; Relational schemas (DBMS) level: collection databases 1, 2, ... n.]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01.02.01-30.07.03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, sross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5, from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7, from Fol. 2r, size 45x50, illum. ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum. Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


The Semantic Web Community Portal, SemanticWeb.org, is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique, and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth, Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

lsquoThe Semantic Web as it is advocated bypeople like Tim Berners Lee and JamesHendler does not take enough advantage

of the experience built up in knowledge engineeringand conceptual modellingThere is this anarchisticidea of the Web as a place where everyone can do hisor her own thing I have no problem with that a lotof people are able to find what they want on theWeb But if you want real interoperability with searchengines that can grasp the intended meaning of infor-mation that approach falls shortTo create a realSemantic Web we have to develop and use well-founded generic ontologies based on linguisticsand logicsrsquo

Nicola Guarino has clear views on the SemanticWeb and its development He is a senior researcher atthe Institute for Cognitive Sciences and Technologiesin Italy where he leads the Laboratory for AppliedOntology Since 1991 he has played an active role inthe Artificial Intelligence community in promotingthe interdisciplinary study of ontological foundationsof knowledge engineering and conceptual modellingGuarino lsquoIn our Laboratory the focus is on contentand not so much on representationThe use ofontologies is unavoidable when referring to contentPeople do it implicitly all the time when they arecommunicating and trying to understand each otherIf we want machines to understand each other inother words real interoperability we need to makethese ontologies explicit in an unambiguous wayrsquo

An ontology is a hierarchical description of therelations between concepts in a certain domain plusan unambiguous description of the conceptsthemselvesAs they are created for a certain domainontologies often fail to be interoperable because ofthe ambiguity that results from the use of the sameterms for different concepts (and vice versa) betweendifferent domainsThe term lsquonetrsquo for instance hasquite a different meaning for Web designers andfishermenThat is why there is a need for well-founded generic ontologiesAn example of a generic

ontology is the term lsquopartrsquo which can have differentmeanings both within a domain and betweendomains For instance the violist plays a part in theorchestra His finger is part of him Can his finger bepart of the orchestra According to Guarino this is agenuine ontological problem that can only be solvedby giving an unambiguous meaning to the term lsquopartrsquo

Another example cited by Guarino is the termlsquoinrsquoWhat exactly are you describing when you saythe spoon is in the cup Does it mean that the spoonis totally embedded in the cup or is it only partly inthe cup Guarino lsquoThese examples seem trivial butif you want real interoperability between differentknowledge domains you will have to prevent theproblems that come with the ambiguity of day-to-day languagersquo

In this respect Guarino thinks it is a drawback thatcomputer science curricula scarcely ever contain anintroduction to ontological foundations of conceptualmodelling lsquoStudents learn all about Java HTML andC++ and name all the other languages and they alsolearn how to use these But when they graduate theyhardly know a thing about formal ontology I reallythink people should know more about the work onontology that has been done in philosophy It iscertainly not much harder to acquire than saystudying differential equations or learning howto use Javarsquo

It seems as if it is an enormous job to developwell-founded generic ontologies but it is not asenormous a task as it appears Guarino lsquoI would saythat a few dozen would get you on the way nicelyBut you have to take the fundamental routeAt themoment development of the Semantic Web is drivenby the need for short-term results Henceinteroperability is realised by putting the right tagson the informationThat is not what I call semanticsthat is syntax XML and RDF are very useful for thisbut they fall short when you want to create a realSemantic WebrsquoLaboratory of Applied Ontology httpontologyiprmcnrit

DigiCULT 25

SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES

AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER

By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function, how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web: The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the Extensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer/).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation (cf. http://dublincore.org/groups/architecture/).

Syntactic Transformation 1: Creating the XML Documents

In the FMS system, the Extensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
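The grouping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the row layout (item identifier, field name, value) is hypothetical, and the element names and namespace are those of the fictitious m-i.org example used in this chapter.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # namespace of the fictitious Website
ET.register_namespace("mi", MI_NS)

# Hypothetical rows as a database view might return them:
# (item_id, field name, value), already ordered by item_id.
ROWS = [
    ("image5kb78d38i", "type", "Column Miniature"),
    ("image5kb78d38i", "subject", "Eve emerges from Adam's body"),
    ("image5kb78d38i", "iconclass", "71A3421"),
    ("image5kb78d38i", "creator", "Alexander Master"),
    ("image5kb78d38i", "manuscript", "Historic Bible"),
    ("image5kb78d38i", "place", "Utrecht"),
    ("image5kb78d38i", "year", "circa 1430"),
]

def rows_to_documents(rows):
    """Group rows by item and build one XML element tree per item."""
    documents = {}
    for item_id, item_rows in groupby(rows, key=lambda row: row[0]):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for _, field, value in item_rows:
            child = ET.SubElement(root, f"{{{MI_NS}}}{field}")
            child.text = value
        documents[item_id] = root
    return documents

docs = rows_to_documents(ROWS)
xml_bytes = ET.tostring(docs["image5kb78d38i"], encoding="ISO-8859-1")
print(xml_bytes.decode("ISO-8859-1"))
```

Grouping with `groupby` relies on the rows arriving sorted by item, which a view can guarantee with an ORDER BY clause.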

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents

A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW: item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document (XML 1.0 recommends the declaration but does not strictly require it). In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.

Database rows, grouped by item:

'image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam's body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process: XML document created from the item data (image5kb78d38i.xml)

(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
         image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
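Whether a document follows these syntax rules can be tested with any XML parser. The following minimal sketch uses the parser from Python's standard library on a shortened version of our example document; the helper name is_well_formed is our own and purely illustrative.

```python
import xml.etree.ElementTree as ET

GOOD = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Same document, but the end tag differs in case from the start tag.
BAD_CASE = GOOD.replace(b"</mi:creator>", b"</mi:Creator>")

# Same document, but the attribute value is not within quotation marks.
BAD_QUOTES = GOOD.replace(b'"image5kb78d38i"', b"image5kb78d38i")

def is_well_formed(document):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True

for name, doc in [("good", GOOD), ("case", BAD_CASE), ("quotes", BAD_QUOTES)]:
    print(name, is_well_formed(doc))
```

A parser only checks well-formedness here; whether the document uses the right elements is a matter for the XML Schema.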

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)    <xs:element name="image">
(4)      <xs:complexType>
(5)        <xs:sequence>
(6)          <xs:element name="type" type="xs:string"/>
(7)          <xs:element name="subject" type="xs:string"/>
(8)          <xs:element name="iconclass" type="xs:string"/>
(9)          <xs:element name="creator" type="xs:string"/>
(10)         <xs:element name="manuscript" type="xs:string"/>
(11)         <xs:element name="place" type="xs:string"/>
(12)         <xs:element name="year" type="xs:string"/>
(13)       </xs:sequence>
(14)       <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)     </xs:complexType>
(16)   </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
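To make concrete what validation against this schema checks, here is a hand-written sketch in Python that tests the same three constraints: the root element, the required image_id attribute and the ordered sequence of child elements. It is not an XML Schema validator (Python's standard library does not include one), and the function name check_image_document is our own.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Child elements in the order required by the xs:sequence of the schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

def check_image_document(xml_bytes):
    """Return a list of problems; an empty list means the checks passed."""
    root = ET.fromstring(xml_bytes)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not mi:image")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if [child.tag for child in root] != expected:
        problems.append("child elements differ from the required sequence")
    return problems

DOC = b"""<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""

print(check_image_document(DOC))
print(check_image_document(DOC.replace(b' image_id="image5kb78d38i"', b"")))
```

In practice one would hand the schema itself to a validating parser rather than re-implement its rules; the sketch only shows what kind of constraints are being enforced.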

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system-specific and non-application-specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet: Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.⁴

Ontology

In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.⁵

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,⁶ and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.⁷

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421, Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place⁸ that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.⁹

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’¹⁰

Yet, for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).¹¹ This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.¹²


7 Bible
  71 Old Testament
    71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
      71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
        71A34 creation of Eve
          71A342 Eve is fashioned from Adam’s rib
            71A3421 Eve emerges from Adam’s body
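In the path shown in the box, each level of the hierarchy is a prefix of the final code. The following sketch uses that observation to rebuild the path from a lookup table containing just the seven definitions above. It is an illustration only: the function name is our own, and it assumes the simple prefix pattern seen in this example rather than the full Iconclass notation.

```python
# Definitions copied from the hierarchy box above.
ICONCLASS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise "
           "and later years of Adam and Eve",
    "71A3": "creation of man, the Garden of Eden (Genesis 1:26-2)",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

def hierarchical_path(code, definitions):
    """List (code, definition) pairs for every known prefix of the code."""
    prefixes = (code[:length] for length in range(1, len(code) + 1))
    return [(p, definitions[p]) for p in prefixes if p in definitions]

for depth, (code, text) in enumerate(hierarchical_path("71A3421", ICONCLASS)):
    print("  " * depth + code + " " + text)
```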

6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website: http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/businessmsstempex.html
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled, directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
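The triple model is simple enough to be written down directly as a data structure. The sketch below is our own illustration in Python, not part of any RDF toolkit: it distinguishes URIs from literals and prints a triple in a simplified line-based notation (URIs in angle brackets, literals in quotation marks, no escaping of special characters). The image URI and the hasCreator property URI are assumptions made for the fictitious m-i.org example.

```python
from typing import NamedTuple, Union

class URI(str):
    """A resource identified by a Uniform Resource Identifier."""

class Literal(str):
    """A plain value such as a name or a date."""

class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]

def to_line(triple):
    """Write one triple as a line: <subject> <predicate> object ."""
    def term(value):
        return f'"{value}"' if isinstance(value, Literal) else f"<{value}>"
    return f"{term(triple.subject)} {term(triple.predicate)} {term(triple.obj)} ."

statement = Triple(
    URI("http://www.m-i.org/images/image5kb78d38i"),      # hypothetical image URI
    URI("http://www.m-i.org/schemas/images#hasCreator"),  # hypothetical property URI
    Literal("Alexander Master"),
)
print(to_line(statement))
```

Because subject and predicate are always URIs while the object may be either, only the object needs the URI-or-literal distinction.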

Graphic 2: RDF Data Model

Subject:   http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object:    http://www.m-i.org/schemas/images#Miniature

Subject:   http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object:    http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#.

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images.

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule <image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Image Miniature, mi:hasCreator, ‘Alexander Master’>.

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
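How a mapping rule is instantiated can be sketched as follows. This is a simplified illustration, not the Meedio tool: the rule is a three-part template whose first and last parts are path expressions, evaluated with the limited XPath subset supported by Python's standard library. The rule format and the function name apply_rule are our own assumptions.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

DOC = b"""<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# A mapping rule: (path to the subject value, RDF property, path to the object value).
RULE = ("mi:type", "mi:hasCreator", "mi:creator")

def apply_rule(rule, root):
    """Instantiate the template; return None when a path finds nothing."""
    subject_path, prop, object_path = rule
    subject_value = root.findtext(subject_path, namespaces=NS)
    object_value = root.findtext(object_path, namespaces=NS)
    if subject_value is None or object_value is None:
        return None  # the rule does not match this document
    return (subject_value, prop, object_value)

root = ET.fromstring(DOC)
print(apply_rule(RULE, root))
```

A rule whose paths find no matching element simply yields no triple, which corresponds to the rule not matching the document.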

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images.

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
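The transitivity of the subclass relationship can also be made concrete in a few lines of Python. The sketch below is only an illustration under simple assumptions: statements are held as plain (resource, property, value) tuples written in prefix:name notation, and the helper `superclasses` is a hypothetical name, not part of any RDF toolkit.

```python
# Statements as (resource, property, value) tuples, using the primer's
# mi: example classes.
TRIPLES = [
    ("mi:Image", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", "rdf:type", "rdfs:Class"),
    ("mi:ColumnMiniature", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
]


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf (transitively)."""
    direct = {}
    for s, p, o in triples:
        if p == "rdfs:subClassOf":
            direct.setdefault(s, set()).add(o)
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in direct.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found
```

Calling `superclasses("mi:ColumnMiniature", TRIPLES)` yields both mi:Miniature, which is stated directly, and mi:Image, which is only implied.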

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
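How such domain and range restrictions might be checked can be sketched in Python. This is a minimal illustration that follows the primer's reading of domain and range as constraints to be verified; RDFS reasoners often use the same statements to infer class membership instead. The schema statements are those given above, written in prefix:name notation, while the instance names (mi:img1 and so on) and the function `find_violations` are invented for the example. Where a property has several domain or range classes, the sketch accepts membership in any one of them, which is an assumption.

```python
SCHEMA = [
    ("mi:hasType", "rdf:type", "rdf:Property"),
    ("mi:hasType", "rdfs:domain", "mi:ColumnMiniature"),
    ("mi:hasType", "rdfs:range", "mi:ColumnMiniature"),
]

# Hypothetical instance data; the resource names are invented for illustration.
DATA = [
    ("mi:img1", "rdf:type", "mi:ColumnMiniature"),
    ("mi:img2", "rdf:type", "mi:ColumnMiniature"),
    ("mi:img3", "rdf:type", "mi:Image"),
    ("mi:img1", "mi:hasType", "mi:img2"),  # satisfies both constraints
    ("mi:img3", "mi:hasType", "mi:img1"),  # subject is outside the domain
    ("mi:img1", "mi:hasType", "mi:img3"),  # value is outside the range
]


def find_violations(schema, data):
    """List (triple, kind) pairs where a statement breaks a domain or range constraint."""
    domains, ranges, types = {}, {}, {}
    for s, p, o in schema:
        if p == "rdfs:domain":
            domains.setdefault(s, set()).add(o)
        elif p == "rdfs:range":
            ranges.setdefault(s, set()).add(o)
    for s, p, o in data:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)

    violations = []
    for triple in data:
        s, p, o = triple
        # Only directly asserted types are considered; subclass inference is left out.
        if p in domains and not (types.get(s, set()) & domains[p]):
            violations.append((triple, "domain"))
        if p in ranges and not (types.get(o, set()) & ranges[p]):
            violations.append((triple, "range"))
    return violations
```

Run on the sample data, the check flags the fifth statement as a domain violation and the sixth as a range violation, and accepts the fourth.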

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’ [13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
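The combination step can be pictured as a set union of statements. The following Python sketch is an assumption-laden illustration, not the FMS crawler: the fms:, a: and b: prefixes, the museum identifiers and the function `build_repository` are all hypothetical, and the actual fetching of files from the museums' servers is left out.

```python
# Hypothetical shared ontology and harvested instance descriptions.
ONTOLOGY = [
    ("fms:Shawl", "rdf:type", "rdfs:Class"),
]

HARVESTED = {
    "museum-a": [("a:item1", "rdf:type", "fms:Shawl")],
    "museum-b": [("b:item7", "rdf:type", "fms:Shawl")],
}


def build_repository(ontology, harvested):
    """Merge the shared ontology with instance triples harvested per museum.

    Returns the merged set of triples and an index recording which museum
    each instance triple came from.
    """
    graph = set(ontology)
    provenance = {}
    for museum, triples in harvested.items():
        for triple in triples:
            graph.add(triple)
            provenance.setdefault(triple, set()).add(museum)
    return graph, provenance
```

Keeping a provenance index alongside the merged graph is a design choice of this sketch; it lets a result be traced back to the museum that published it.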

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
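The principle of view-based filtering can be sketched in Python as follows. This is a simplified illustration and not Ontogator's actual algorithm: the instance identifiers, the class names and the function `filter_by_views` are hypothetical, and class hierarchies are ignored.

```python
# Hypothetical collection instances with the classes each one belongs to.
INSTANCES = {
    "item1": {"Shawl", "Wool", "Karelia"},
    "item2": {"Shawl", "Linen"},
    "item3": {"Apron", "Wool"},
}


def filter_by_views(instances, selected):
    """Return the identifiers of instances that belong to every selected class."""
    wanted = set(selected)
    return sorted(item for item, classes in instances.items() if wanted <= classes)
```

Selecting only "Shawl" returns item1 and item2; adding "Wool" as a second view narrows the result to item1, which mirrors how each further class restriction shrinks the result set.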

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems. [15]

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.


[14] T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

[15] M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. The Finnish Museums on the Semantic Web: Overview of the System’s Set-up. Components shown: collection databases 1 to n (relational schemas, DBMS); XML repositories (XML Schema, syntactic interoperability); Metadata Editor; RDF instances (RDF Schema, semantic interoperability); Web crawler; RDF database holding the semantic graph (knowledge space of shared ontology and metadata); Ontogator software on the server; WWW browser as user client with topic-based navigation and view-based filtering.]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 – Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 – Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML – eXtensible Mark-up Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content, and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a notion that a generic ontology has to pin down is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

Developing well-founded generic ontologies seems an enormous job, but the task is not as large as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics, that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it


SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES

AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY

By Joost van Kasteren


The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’ [1]. This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’ [2]. Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER

By Guntram Geser

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html

[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html

[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS) and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’ [3]

Yet given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents

In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item the set of rows is combined into a single XML document.
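The rows-to-XML step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the two tables `items` and `descriptions`, the way they are joined, and the helper names are hypothetical; the column names follow the ITEM_VIEW shown in graphic 1 and the namespace is the one used for the fictitious m-i.org site.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # namespace of the fictitious m-i.org examples
COLUMNS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def build_demo_database():
    """Create a tiny in-memory database with a view joining two (hypothetical) tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (item_id TEXT, type TEXT, place TEXT, year TEXT);
        CREATE TABLE descriptions (item_id TEXT, subject TEXT, iconclass TEXT,
                                   creator TEXT, manuscript TEXT);
        INSERT INTO items VALUES
            ('image5kb78d38i', 'Column Miniature', 'Utrecht', 'circa 1430');
        INSERT INTO descriptions VALUES
            ('image5kb78d38i', 'Eve emerges from Adam''s body', '71A3421',
             'Alexander Master', 'Historic Bible');
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, d.subject, d.iconclass, d.creator,
                   d.manuscript, i.place, i.year
            FROM items i JOIN descriptions d ON d.item_id = i.item_id;
    """)
    return conn


def rows_to_xml(conn):
    """Group the view's rows by item and build one XML document string per item."""
    ET.register_namespace("mi", MI_NS)
    query = "SELECT item_id, " + ", ".join(COLUMNS) + " FROM item_view ORDER BY item_id"
    documents = {}
    for item_id, rows in groupby(conn.execute(query), key=lambda row: row[0]):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        seen = set()
        for row in rows:
            for column, value in zip(COLUMNS, row[1:]):
                # Skip empty cells and values already emitted for this item.
                if value is not None and (column, value) not in seen:
                    seen.add((column, value))
                    ET.SubElement(root, f"{{{MI_NS}}}{column}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


documents = rows_to_xml(build_demo_database())
```

In this sketch each item has a single row in the view; when a join yields several rows per item, the grouping step is what collects them into one document.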

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents

A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf

ITEM_VIEW (columns): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document (if a declaration is present, it must be the very first thing in the document). In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2/10 a) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Graphic 1: Database rows to XML process

Database rows (item data, grouped by item):

‘image5kb78d38i’, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

XML document produced from the database rows (image5kb78d38i.xml):

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images"

The namespace prefix mi: (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
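That the prefix is only a placeholder can be seen by parsing two documents that bind different prefixes to the same namespace name. The following is a small sketch using Python's standard `xml.etree.ElementTree`; the second prefix `img` and the helper name `expanded_names` are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Same namespace name, two different (arbitrary) prefixes.
doc_mi = (
    '<mi:image xmlns:mi="http://www.m-i.org/images">'
    "<mi:type>Column Miniature</mi:type></mi:image>"
)
doc_img = (
    '<img:image xmlns:img="http://www.m-i.org/images">'
    "<img:type>Column Miniature</img:type></img:image>"
)


def expanded_names(xml_text):
    """Return the namespace-expanded name of every element, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


names_mi = expanded_names(doc_mi)
names_img = expanded_names(doc_img)
```

The parser replaces each prefix with the namespace name it stands for, so both documents yield identical element names of the form `{http://www.m-i.org/images}type`.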

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags, and because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>) they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
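These rules can be tried out with any XML parser, since a conforming parser rejects documents that are not well-formed. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper `is_well_formed` and the three deliberately broken variants are illustrative only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)

# End tag written with a different case than the start tag.
wrong_case = good.replace("</mi:creator>", "</mi:Creator>")

# Attribute value without quotation marks.
unquoted = good.replace('image_id="image5kb78d38i"', "image_id=image5kb78d38i")

# Two root elements instead of a single one.
two_roots = good + good
```

Only `good` passes the check; each of the other three variants breaks exactly one of the rules listed above.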

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative’s XML Schema when they create their XML documents and validate them against it. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:

| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3) <xs:element name="image">
(4)   <xs:complexType>
(5)     <xs:sequence>
(6)       <xs:element name="type" type="xs:string"/>
(7)       <xs:element name="subject" type="xs:string"/>
(8)       <xs:element name="iconclass" type="xs:string"/>
(9)       <xs:element name="creator" type="xs:string"/>
(10)      <xs:element name="manuscript" type="xs:string"/>
(11)      <xs:element name="place" type="xs:string"/>
(12)      <xs:element name="year" type="xs:string"/>
(13)    </xs:sequence>
(14)    <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)  </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.

(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.

(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.

(2c) Is our default namespace.

(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.

(3/16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5/13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.

(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.

(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
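To make the constraints expressed by this schema tangible, the sketch below checks them by hand in Python. It is explicitly not an XML Schema validator (Python's standard library does not include one, and a real project would use a schema-aware tool); it only re-states three of the schema's rules as code: the root element, the required image_id attribute, and the ordered sequence of child elements. The function name and the test documents are hypothetical.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Child elements in the order declared by the xs:sequence (lines 6-12 of the schema).
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image_document(xml_text):
    """Return a list of problems; an empty list means all three checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not 'image' in the m-i.org namespace")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != expected:
        problems.append("child elements differ from the declared sequence")
    return problems


valid_doc = (
    f'<mi:image xmlns:mi="{MI_NS}" image_id="image5kb78d38i">'
    + "".join(f"<mi:{name}>x</mi:{name}>" for name in EXPECTED_CHILDREN)
    + "</mi:image>"
)

# Drop the required attribute and the last child element.
invalid_doc = valid_doc.replace(' image_id="image5kb78d38i"', "").replace(
    "<mi:year>x</mi:year>", ""
)
```

Calling `check_image_document(valid_doc)` returns an empty list, while the second document is reported with two problems.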

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge. [4]

Ontology

In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge. [5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies [6], and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.

[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate. [7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture. [9]

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’ [10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM). [11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably. [12]


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
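For a plain code such as 71A3421, the hierarchical path shown above is simply the list of its prefixes: each additional character narrows the concept by one level. The following Python sketch derives the path that way. It is a simplification for illustration only and assumes plain alphanumeric codes; it does not attempt to handle the further notation features of the full Iconclass system. The function name and the small label table (copied from the path above) are hypothetical.

```python
# Labels copied from the hierarchical path shown above.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def iconclass_path(code):
    """Return all prefixes of a plain code, from the top division down to the code itself."""
    return [code[:length] for length in range(1, len(code) + 1)]


path = iconclass_path("71A3421")
labelled_path = [(step, LABELS[step]) for step in path]
```

Because broader concepts are prefixes of narrower ones, a search for everything under ‘creation of Eve’ reduces to a search for codes starting with 71A34.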

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.

[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl

[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/businessmsstempex.html

[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html

[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327

[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.

[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’ the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core, RDF is very abstract, very dry and very academic.’ Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture such statements about resources are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs) and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Graphic 2: RDF Data Model

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
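The two triples of graphic 2 can be written down as plain (subject, predicate, object) tuples, which also shows why a program can derive that a ColumnMiniature is an Image even though no triple says so directly. The Python sketch below is an illustration only; it assumes that class names are appended to the m-i.org schema namespace with a ‘#’ separator, and the function name `superclasses` is made up. A real application would use an RDF toolkit rather than hand-written tuples.

```python
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
MI = "http://www.m-i.org/schemas/images#"  # assumed form of our schema namespace

# The two statements of graphic 2 as (subject, predicate, object) triples.
triples = {
    (MI + "ColumnMiniature", RDFS_SUBCLASS_OF, MI + "Miniature"),
    (MI + "Miniature", RDFS_SUBCLASS_OF, MI + "Image"),
}


def superclasses(resource, graph):
    """Collect every class reachable from resource by following rdfs:subClassOf arcs."""
    found = set()
    pending = [resource]
    while pending:
        current = pending.pop()
        for subject, predicate, obj in graph:
            if subject == current and predicate == RDFS_SUBCLASS_OF and obj not in found:
                found.add(obj)
                pending.append(obj)
    return found


column_miniature_supers = superclasses(MI + "ColumnMiniature", triples)
```

Following the arcs from ColumnMiniature reaches both Miniature and Image, which is the kind of inference a software agent can draw once the class hierarchy is stated explicitly.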

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule

<image5kb78d38itype mi:hasCreator image5kb78d38icreator>

to the XML document described in the section on XML, the following result would be obtained:

<Image Miniature mi:hasCreator ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
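The idea of a mapping rule can be sketched as follows. This is a hedged illustration, not the rule syntax of the FMS tool: the rule is modelled here as a simple Python tuple of (path to the subject value, predicate, path to the object value), the paths use the limited XPath subset supported by Python's `xml.etree.ElementTree`, and the function name `apply_rule` is hypothetical. The sketch takes the element text exactly as it stands in the XML document of graphic 1.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

xml_document = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Hypothetical rule: (path to subject value, predicate name, path to object value).
creator_rule = ("mi:type", "mi:hasCreator", "mi:creator")


def apply_rule(rule, xml_text):
    """Instantiate a rule against a document; return a triple, or None if it does not match."""
    subject_path, predicate, object_path = rule
    root = ET.fromstring(xml_text)
    subject = root.findtext(subject_path, namespaces=NS)
    obj = root.findtext(object_path, namespaces=NS)
    if subject is None or obj is None:
        return None  # the rule does not match this document
    return (subject, predicate, obj)


triple = apply_rule(creator_rule, xml_document)
no_match = apply_rule(("mi:type", "mi:hasPlace", "mi:place"), xml_document)
```

The first rule evaluates to a triple whose subject and object are the values found at the two paths; the second rule refers to an element that this shortened document does not contain and therefore yields no triple.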

Note: Due to the limited space available, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In the section Semantic Transformation 1 we described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
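The implicit subclass relationship can also be derived mechanically. The following Python sketch stores the schema statements from this section as plain (subject, property, value) tuples and follows the rdfs:subClassOf arcs; the function name superclasses and the tuple representation are assumptions made for this illustration, not part of RDF Schema itself.

```python
# The class definitions of this section as (resource, property, value) triples.
SCHEMA = {
    ("mi:Image", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", "rdf:type", "rdfs:Class"),
    ("mi:ColumnMiniature", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
}


def superclasses(cls, triples):
    """Return all direct and inherited superclasses of cls."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for subject, prop, value in triples:
            if subject == current and prop == "rdfs:subClassOf" and value not in found:
                found.add(value)
                todo.append(value)  # follow the arc further up the hierarchy
    return found


print(sorted(superclasses("mi:ColumnMiniature", SCHEMA)))
```

Although no statement links mi:ColumnMiniature directly to mi:Image, the traversal reaches it via mi:Miniature, which is what transitivity means in practice.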

Defining properties

In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
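How a validator might use these two constraints can be sketched as follows. The sketch treats rdfs:domain and rdfs:range as restrictions to be checked, as this primer does; it is not the validation logic of the FMS editor tool. The property mi:hasCreator appears earlier in the text, while the classes mi:Illuminator and mi:Place, the ex: instances and the function names are hypothetical.

```python
# Schema statements; mi:Illuminator is a hypothetical class for this sketch.
SCHEMA = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:hasCreator", "rdfs:domain", "mi:Miniature"),
    ("mi:hasCreator", "rdfs:range", "mi:Illuminator"),
}

# Instance data; all ex: resources are invented examples.
DATA = {
    ("ex:img1", "rdf:type", "mi:ColumnMiniature"),
    ("ex:alexanderMaster", "rdf:type", "mi:Illuminator"),
    ("ex:utrecht", "rdf:type", "mi:Place"),
}


def classes_of(resource):
    """Declared classes of a resource plus all their superclasses."""
    result = {o for s, p, o in DATA if s == resource and p == "rdf:type"}
    todo = list(result)
    while todo:
        current = todo.pop()
        for s, p, o in SCHEMA:
            if s == current and p == "rdfs:subClassOf" and o not in result:
                result.add(o)
                todo.append(o)
    return result


def violations(statement):
    """List the domain and range restrictions that a statement breaks."""
    subject, prop, value = statement
    problems = []
    for s, constraint, required in sorted(SCHEMA):
        if s != prop:
            continue
        if constraint == "rdfs:domain" and required not in classes_of(subject):
            problems.append(f"domain: {subject} is not a {required}")
        elif constraint == "rdfs:range" and required not in classes_of(value):
            problems.append(f"range: {value} is not a {required}")
    return problems


print(violations(("ex:img1", "mi:hasCreator", "ex:alexanderMaster")))
print(violations(("ex:img1", "mi:hasCreator", "ex:utrecht")))
```

Because class membership is inherited, the column miniature ex:img1 satisfies a domain declared for mi:Miniature; the second statement is rejected because its value is not an instance of the range class.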

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.' [13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
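The combination of RDF cards into one repository and the idea of view-based filtering can be illustrated with a short Python sketch. It is not a description of Ontogator: the two museum cards, the a:, b: and ex: names and the functions below are assumptions, and the repository is simply the set union of the ontology and the harvested triples.

```python
# Shared ontology (class hierarchy from the primer's running example).
ONTOLOGY = {
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
}

# Hypothetical RDF cards harvested from two museums.
CARD_MUSEUM_A = {
    ("a:img1", "rdf:type", "mi:ColumnMiniature"),
    ("a:img1", "rdf:type", "ex:UtrechtWork"),
}
CARD_MUSEUM_B = {
    ("b:img7", "rdf:type", "mi:Miniature"),
    ("b:img9", "rdf:type", "mi:Image"),
    ("b:img9", "rdf:type", "ex:UtrechtWork"),
}

# The repository is the union of the shared ontology and all harvested cards.
repository = ONTOLOGY | CARD_MUSEUM_A | CARD_MUSEUM_B


def with_subclasses(cls, graph):
    """Return cls together with all of its direct and indirect subclasses."""
    found = {cls}
    todo = [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if p == "rdfs:subClassOf" and o == current and s not in found:
                found.add(s)
                todo.append(s)
    return found


def instances_of(cls, graph):
    """Instances typed with cls or with any subclass of cls."""
    wanted = with_subclasses(cls, graph)
    return {s for s, p, o in graph if p == "rdf:type" and o in wanted}


def view_filter(selected_classes, graph):
    """Instances that satisfy every selected class restriction."""
    results = None
    for cls in selected_classes:
        matches = instances_of(cls, graph)
        results = matches if results is None else results & matches
    return sorted(results or [])


print(view_filter(["mi:Image"], repository))
print(view_filter(["mi:Miniature", "ex:UtrechtWork"], repository))
```

Each additional class selected narrows the result set, which corresponds to 'constraining classes (views) further' in the text.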

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge's introduction to multiagent systems. [15]

Agent

'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent

'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action

'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language and perhaps cooperate with others.'

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.


14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19.

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


The Finnish Museums on the Semantic Web: Overview of the System's Set-up

[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Components shown: user client (WWW browser with topic-based navigation and view-based filtering); server (Ontogator software); RDF database (semantic graph: knowledge space of shared ontology and metadata); Web crawler; and, for each museum, collection database 1 to n (relational schemas, DBMS), XML repository (XML Schema: syntactic interoperability), metadata editor and RDF instances (RDF Schema: semantic interoperability).]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria

Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy

Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He has worked in the Italian State Archive Administration since 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics applications in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France

Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece

Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands

Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy

Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands

Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy

Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK

Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands

Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy

Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK

Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that he worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy

Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses/
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)', which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5, from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7, from Fol. 2r, size 45x50; illum.: 'Sub-Fauvel' Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


'The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.'

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: 'In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.'

An ontology is a hierarchical description of therelations between concepts in a certain domain plusan unambiguous description of the conceptsthemselvesAs they are created for a certain domainontologies often fail to be interoperable because ofthe ambiguity that results from the use of the sameterms for different concepts (and vice versa) betweendifferent domainsThe term lsquonetrsquo for instance hasquite a different meaning for Web designers andfishermenThat is why there is a need for well-founded generic ontologiesAn example of a generic

ontology is the term lsquopartrsquo which can have differentmeanings both within a domain and betweendomains For instance the violist plays a part in theorchestra His finger is part of him Can his finger bepart of the orchestra According to Guarino this is agenuine ontological problem that can only be solvedby giving an unambiguous meaning to the term lsquopartrsquo

Another example cited by Guarino is the termlsquoinrsquoWhat exactly are you describing when you saythe spoon is in the cup Does it mean that the spoonis totally embedded in the cup or is it only partly inthe cup Guarino lsquoThese examples seem trivial butif you want real interoperability between differentknowledge domains you will have to prevent theproblems that come with the ambiguity of day-to-day languagersquo

In this respect Guarino thinks it is a drawback thatcomputer science curricula scarcely ever contain anintroduction to ontological foundations of conceptualmodelling lsquoStudents learn all about Java HTML andC++ and name all the other languages and they alsolearn how to use these But when they graduate theyhardly know a thing about formal ontology I reallythink people should know more about the work onontology that has been done in philosophy It iscertainly not much harder to acquire than saystudying differential equations or learning howto use Javarsquo

It seems as if it is an enormous job to developwell-founded generic ontologies but it is not asenormous a task as it appears Guarino lsquoI would saythat a few dozen would get you on the way nicelyBut you have to take the fundamental routeAt themoment development of the Semantic Web is drivenby the need for short-term results Henceinteroperability is realised by putting the right tagson the informationThat is not what I call semanticsthat is syntax XML and RDF are very useful for thisbut they fall short when you want to create a realSemantic WebrsquoLaboratory of Applied Ontology httpontologyiprmcnrit

DigiCULT 25

SEMANTIC WEB SHOULD BE BASED

ON WELL-FOUNDED ONTOLOGIESAN INTERVIEW WITH NICOLA GUARINOLABORATORY OF APPLIED ONTOLOGY ITALY

By Joost van Kasteren


A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER

By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’ [1]. This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’ [2]. Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’ [3]

Yet given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents

In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item the set of rows is combined into a single XML document.
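The view-to-XML step described above can be sketched in a few lines of Python. This is only an illustration, not the FMS implementation: the table names (item, subject), the view name item_view, the function names and the sample record are assumptions, and the namespace prefix used later in this primer is left out for brevity.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

# Column order of the (hypothetical) view; item_id becomes an attribute.
COLUMNS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]

def make_demo_db():
    """Create an in-memory database with two tables and a joining view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, creator TEXT,
                           manuscript TEXT, place TEXT, year TEXT);
        CREATE TABLE subject (item_id TEXT, subject TEXT, iconclass TEXT);
        INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature',
            'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430');
        INSERT INTO subject VALUES ('image5kb78d38i',
            'Eve emerges from Adam''s body', '71A3421');
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, s.subject, s.iconclass, i.creator,
                   i.manuscript, i.place, i.year
            FROM item i JOIN subject s ON s.item_id = i.item_id;
    """)
    return conn

def rows_to_documents(conn):
    """Group the view rows by item and build one XML document per item."""
    conn.row_factory = sqlite3.Row
    rows = conn.execute("SELECT * FROM item_view ORDER BY item_id").fetchall()
    documents = {}
    for item_id, group in groupby(rows, key=lambda r: r["item_id"]):
        group = list(group)
        root = ET.Element("image", {"image_id": item_id})
        for column in COLUMNS:
            seen = []  # a joined view can repeat values; emit each one once
            for row in group:
                value = row[column]
                if value is not None and value not in seen:
                    seen.append(value)
                    ET.SubElement(root, column).text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents

docs = rows_to_documents(make_demo_db())
print(docs["image5kb78d38i"])
```

Grouping by item_id before building the document is what turns several joined rows (for instance, one row per subject) into a single record per collection item.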

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents

A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf

[Graphic 1, database view ITEM_VIEW with the columns: item_id, type, subject, iconclass, creator, manuscript, place, year]


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document (the declaration is optional in XML 1.0, but recommended). In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Database rows (grouped by item):

image5kb78d38i, ‘ColumnMiniature’, ‘Eve emerges from Adams Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

Rows to XML process: the item data from the database rows become the XML document image5kb78d38i.xml:

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match end-tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
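The well-formedness and namespace rules above can be tried out with any XML parser. The following sketch uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed is an assumption made for this illustration, and the document is the one from graphic 1 (with a plain ASCII apostrophe).

```python
import xml.etree.ElementTree as ET

MI = "http://www.m-i.org/images"  # namespace name of the fictitious site

DOCUMENT = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>
"""

def is_well_formed(data):
    """Return True if the parser accepts the bytes as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

root = ET.fromstring(DOCUMENT)
prefixes = {"mi": MI}  # the prefix is only a placeholder for the URI

# The parser replaces the prefix by the full namespace name.
print(root.tag)                              # {http://www.m-i.org/images}image
print(root.get("image_id"))                  # image5kb78d38i
print(root.find("mi:creator", prefixes).text)  # Alexander Master

# Tags are case sensitive: an end tag </mi:Creator> does not close <mi:creator>.
BROKEN = DOCUMENT.replace(b"</mi:creator>", b"</mi:Creator>")
print(is_well_formed(DOCUMENT), is_well_formed(BROKEN))  # True False
```

Note that the parser only checks syntax here. It attaches no meaning to the element names, which is exactly the point made in the section on XML above.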

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums validate the XML documents they create against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)   targetNamespace="http://www.m-i.org/images"
(2c)   xmlns="http://www.m-i.org/images"
(2d)   elementFormDefault="qualified">
(3) <xs:element name="image">
(4)   <xs:complexType>
(5)     <xs:sequence>
(6)       <xs:element name="type" type="xs:string"/>
(7)       <xs:element name="subject" type="xs:string"/>
(8)       <xs:element name="iconclass" type="xs:string"/>
(9)       <xs:element name="creator" type="xs:string"/>
(10)      <xs:element name="manuscript" type="xs:string"/>
(11)      <xs:element name="place" type="xs:string"/>
(12)      <xs:element name="year" type="xs:string"/>
(13)    </xs:sequence>
(14)    <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)  </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define the various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
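To make concrete what validation against this Schema checks, here is a small hand-written sketch in Python. It is not an XML Schema processor (Python's standard library does not include one); it merely re-implements the three constraints the Schema above expresses: the root element, the required image_id attribute, and the ordered sequence of text-only child elements. The function name check_image_document and the wording of the messages are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

MI = "http://www.m-i.org/images"
# Order of the child elements as declared inside xs:sequence (lines 6-12).
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

def check_image_document(data):
    """Return a list of problems; an empty list means all checks passed."""
    try:
        root = ET.fromstring(data)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    problems = []
    if root.tag != f"{{{MI}}}image":
        problems.append(f"unexpected root element {root.tag}")
    if root.get("image_id") is None:           # use="required" (line 14)
        problems.append("missing required attribute image_id")
    found = [child.tag for child in root]
    expected = [f"{{{MI}}}{name}" for name in EXPECTED_CHILDREN]
    if found != expected:                      # xs:sequence fixes the order
        problems.append("child elements differ from the declared sequence")
    for child in root:
        if len(child) > 0:                     # xs:string: text only
            problems.append(f"{child.tag} must contain only text")
    return problems

VALID = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>
"""

print(check_image_document(VALID))  # []
```

In practice one would hand the Schema itself to a validating processor instead of coding the rules by hand; the sketch only shows which kinds of errors such a processor reports.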

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge [4].

Ontology

In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge [5].

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies [6], and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate [7].

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421, Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture [9].

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’ [10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM) [11]. This provides an ontology of 81 classes and 130 properties which describes, in a formal language, concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably [12].


Hierarchical path of the Iconclass definition 71A3421:

7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
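For the path shown above, every level of the hierarchy is simply a longer prefix of the classification code. A minimal Python sketch of that observation follows; it is a simplification that assumes plain codes like 71A3421 and does not attempt to handle the richer parts of the Iconclass notation. The function name and the LABELS table (copied from the path above) are assumptions for this illustration.

```python
def iconclass_path(code):
    """Return the chain of broader codes, from the top level down to `code`."""
    return [code[:length] for length in range(1, len(code) + 1)]

# Textual correlates for the example path, copied from the listing above.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

for step in iconclass_path("71A3421"):
    print(step, LABELS.get(step, "(no label in this sketch)"))
```

Because broader concepts are prefixes of narrower ones, a search for all images below ‘71A34 creation of Eve’ reduces to a prefix match on the stored codes.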

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’ the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

- resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources;
- properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
- statements: these associate a value for a named property with the resource.

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Graphic 2: RDF Data Model

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature, and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
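The two subClassOf triples of graphic 2 can be written down and queried with nothing more than tuples. The sketch below is a plain-Python illustration rather than a use of any RDF toolkit; the namespace URI for the fictitious m-i.org schema (including the ‘#’ separator) and the function name superclasses are assumptions.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"   # assumed schema namespace

# Each statement is a (subject, predicate, object) triple of URIs.
triples = {
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}

def superclasses(cls, triples):
    """Follow subClassOf arcs transitively and return all broader classes."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for subject, predicate, obj in triples:
            if (subject == current
                    and predicate == RDFS + "subClassOf"
                    and obj not in found):
                found.add(obj)
                todo.append(obj)
    return found

for uri in sorted(superclasses(MI + "ColumnMiniature", triples)):
    print(uri)
```

Although only two statements were made, the program can conclude that a ColumnMiniature is also an Image. Drawing such conclusions from explicitly stated relationships is what the semantic layer adds on top of the XML syntax.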

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule <image5kb78d38i/type mi:hasCreator image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Image Miniature mi:hasCreator ‘Alexander Master’>.

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
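How a mapping rule of this kind is instantiated can be sketched as follows. This is a guess at the general mechanism, not the rule syntax of the FMS editor tool: the rule representation (pairs tagged "xpath" or "const"), the function name apply_rule and the concrete paths are assumptions, and ElementTree supports only a small subset of XPath.

```python
import xml.etree.ElementTree as ET

PREFIXES = {"mi": "http://www.m-i.org/images"}

DOCUMENT = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>
"""

# A rule is a triple template. "xpath" parts are looked up in the document,
# "const" parts are copied into the resulting triple unchanged.
RULE = (("xpath", "mi:type"), ("const", "mi:hasCreator"), ("xpath", "mi:creator"))

def apply_rule(rule, root):
    """Instantiate the template; return None if a path has no match."""
    triple = []
    for kind, value in rule:
        if kind == "xpath":
            node = root.find(value, PREFIXES)
            if node is None:
                return None          # the rule does not match this document
            triple.append(node.text)
        else:
            triple.append(value)
    return tuple(triple)

root = ET.fromstring(DOCUMENT)
print(apply_rule(RULE, root))
# ('Column Miniature', 'mi:hasCreator', 'Alexander Master')
```

A document lacking one of the addressed elements yields no triple at all, which mirrors the statement above that a template is only evaluated if the rule matches.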

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. The editor tool cannot cope with polysemous terms (i.e. the same term referring to different concepts); in such cases the cataloguer needs to select the correct interpretation.
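The difference between the two cases can be illustrated with a small Python sketch. The synonym sets, the class mi:Initial and the helper names are hypothetical and are not taken from the FMS ontology; the sketch only shows why a synonym can be resolved automatically while a polysemous term has to be handed back to the cataloguer.

```python
# Hypothetical synonym sets attached to ontology classes (all terms assumed).
SYNONYM_SETS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:Initial": {"initial", "decorated initial", "illumination"},
}

def candidate_classes(term):
    """Return every ontology class whose synonym set contains the term."""
    key = term.strip().lower()
    return sorted(cls for cls, synonyms in SYNONYM_SETS.items() if key in synonyms)

def map_term(term):
    """Map a term automatically only when exactly one class matches."""
    candidates = candidate_classes(term)
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown or polysemous: the cataloguer has to decide

print(map_term("Miniature"))
print(candidate_classes("illumination"))
```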

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In the section Semantic Transformation 1 we described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:

mi:Image [resource] rdf:type [property] rdfs:Class [value]

The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:

mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:

mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
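The transitivity of the subclass relation can be made concrete with a short Python sketch. It stores the class statements from the text as plain (subject, predicate, object) tuples; the tuple representation and the `superclasses` helper are assumptions of this sketch, not part of RDF Schema or of any RDF library.

```python
RDFS_SUBCLASS = "rdfs:subClassOf"

# The class and subclass statements from the text, as plain tuples.
triples = {
    ("mi:Image", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", "rdf:type", "rdfs:Class"),
    ("mi:ColumnMiniature", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", RDFS_SUBCLASS, "mi:Image"),
    ("mi:ColumnMiniature", RDFS_SUBCLASS, "mi:Miniature"),
}

def superclasses(cls, graph):
    """Return all direct and implied superclasses of cls."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == RDFS_SUBCLASS and o not in found:
                found.add(o)
                todo.append(o)
    return found

print(sorted(superclasses("mi:ColumnMiniature", triples)))
# prints ['mi:Image', 'mi:Miniature']
```

Although only two subclass statements are stored, mi:Image is reported as a superclass of mi:ColumnMiniature because the helper follows the chain of rdfs:subClassOf arcs.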

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:

mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:

mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:

mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
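How a validator might use the two constraints can be sketched as follows. The sketch takes the domain and range of mi:hasType from the two examples in the text; the instance identifiers, the class mi:Initial and the `violations` helper are hypothetical, and for simplicity the check ignores subclass relationships.

```python
# Constraints taken from the two examples in the text.
CONSTRAINTS = {
    ("mi:hasType", "rdfs:domain"): "mi:ColumnMiniature",
    ("mi:hasType", "rdfs:range"): "mi:ColumnMiniature",
}

# Hypothetical instances and their classes, assumed for illustration.
INSTANCE_TYPES = {
    "mi:img001": "mi:ColumnMiniature",
    "mi:img002": "mi:ColumnMiniature",
    "mi:img003": "mi:Initial",
}

def violations(statement):
    """Return the list of constraints ('domain', 'range') a statement violates."""
    subject, prop, obj = statement
    problems = []
    domain = CONSTRAINTS.get((prop, "rdfs:domain"))
    if domain is not None and INSTANCE_TYPES.get(subject) != domain:
        problems.append("domain")
    rng = CONSTRAINTS.get((prop, "rdfs:range"))
    if rng is not None and INSTANCE_TYPES.get(obj) != rng:
        problems.append("range")
    return problems

print(violations(("mi:img001", "mi:hasType", "mi:img002")))
print(violations(("mi:img003", "mi:hasType", "mi:img001")))
```

A statement is accepted only when its subject belongs to the domain class and its value belongs to the range class; a property without declared constraints is never flagged.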

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:

| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’ [13]

This is a well-crafted listing of RDF benefits, from the provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
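Conceptually, combining the harvested RDF cards amounts to a set union of triples. The following Python sketch illustrates this with invented data; the museum prefixes, the instance names and the `build_repository` helper are assumptions of the sketch and say nothing about how the FMS crawler or repository is actually implemented.

```python
# Hypothetical shared ontology and per-museum RDF cards, each a set of triples.
ONTOLOGY = {
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
}
MUSEUM_A_CARDS = [
    {("a:img1", "rdf:type", "mi:ColumnMiniature"),
     ("a:img1", "mi:hasCreator", "Alexander Master")},
]
MUSEUM_B_CARDS = [
    {("b:img7", "rdf:type", "mi:Miniature")},
    {("b:img8", "rdf:type", "mi:ColumnMiniature")},
]

def build_repository(ontology, harvested_cards):
    """Merge the shared ontology and all harvested RDF cards into one graph."""
    repository = set(ontology)
    for card in harvested_cards:
        repository |= card  # set union: identical triples are stored once
    return repository

repository = build_repository(ONTOLOGY, MUSEUM_A_CARDS + MUSEUM_B_CARDS)
print(len(repository))
```

Because every card uses the classes and properties of the same ontology, triples from different museums can be merged without any further conversion.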

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
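The narrowing effect of view-based filtering can be sketched in a few lines of Python. This is a simplification and not the Ontogator implementation: each view is modelled as a property with one selected value, and the repository contents and the `view_filter` helper are invented for illustration.

```python
RDF_TYPE = "rdf:type"

# Hypothetical repository of instance descriptions (all names assumed).
REPOSITORY = {
    ("a:img1", RDF_TYPE, "mi:ColumnMiniature"),
    ("a:img1", "mi:hasCreator", "mi:AlexanderMaster"),
    ("b:img7", RDF_TYPE, "mi:ColumnMiniature"),
    ("b:img7", "mi:hasCreator", "mi:OrosiusMaster"),
    ("b:img8", RDF_TYPE, "mi:Initial"),
    ("b:img8", "mi:hasCreator", "mi:AlexanderMaster"),
}

def view_filter(repository, selections):
    """Return the instances that satisfy every selected restriction.

    selections maps a property (one view) to the value selected in it.
    With no selection at all, the sketch returns an empty list.
    """
    matches = None
    for prop, value in selections.items():
        hits = {s for s, p, o in repository if p == prop and o == value}
        matches = hits if matches is None else matches & hits
    return sorted(matches or [])

print(view_filter(REPOSITORY, {RDF_TYPE: "mi:ColumnMiniature"}))
print(view_filter(REPOSITORY, {RDF_TYPE: "mi:ColumnMiniature",
                               "mi:hasCreator": "mi:AlexanderMaster"}))
```

Each additional selection intersects the previous result set, which is how constraining further views gradually reduces the candidates to the records searched for.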

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and the metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]

This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems. [15]

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’

| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.

[14] T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
[15] M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System [figure, overview of the system’s set-up. Components shown: collection databases 1 to n (relational schemas, DBMS); XML repositories (XML Schema, syntactic interoperability); metadata editor and RDF instances (RDF Schema, semantic interoperability); Web crawler; RDF database holding the semantic graph, i.e. the knowledge space of shared ontology and metadata; server with the Ontogator software; user client, a WWW browser offering topic-based navigation and view-based filtering]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, UK
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it


DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT, please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DIGICULT PROJECT INFORMATION

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’. [1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Webthe Semantic Web is envisaged as a Web of machine-readable data that will be based on lsquolanguages forexpressing information in a machine processableformrsquo2 Key to an understanding of the SemanticWeb therefore is how these languages function howinformation is expressed in order that computers canautomatically process Web sources and assist inmaking the Web more useful for humansThe aimof this chapter is to provide an overview of theSemantic Web concept by describing its generalarchitecture ie the interplay of its languages

The chapter has two interrelated parts Part 1describes a Finnish project that strives to build thefoundations for the ldquoFinnish Museums on theSemantic Webrdquo (FMS) a future semantic museumportalThis part consists of the information boxes on the following pages which briefly describe thenecessary elements and steps in the set-up of the FMS system It is recommended to start by readingthis description (see also graphic 3 on page 36 whichprovides an overview of the set-up of the FMSsystem) It should be helpful in gaining a generalunderstanding of how semantic interoperability of andnew ways of interacting with semantically marked-upcultural heritage information can be realised

Part 2 the texts below the information boxes isa primer that explains terms used in part 1 whichrepresent core elements of the Semantic Webarchitecture as well as providing illustrative examplesThe explanations are not intended to give in-depthdefinitions of these elements such definitions areprovided in the relevant W3C specificationsTheexamples have been kept as simple as possible butbuild on each other In this way we will developa (fictitious) Website httpwwwm-iorg thatprovides semantically enhanced access to suchmarvellous medieval images as the ones we haveused to illustrate this Thematic Issue

How to Make Collection Metadata ofMuseums Semantically Interoperable onthe Web ndash The ldquoFinnish Museums on theSemantic Webrdquo (FMS)

The Semantic Web concept is visionary and thereare dedicated people also in the heritage sector whoare trying to make it a reality In our example agroup of researchers and technology developers whowork at the University of Helsinki and the HelsinkiInstitute for Information Technology are translatingthe Semantic Web vision for a future semanticmuseum portal

The grouprsquos two-year project will run until spring2004 and is being carried out in co-operation withand with funding from major organisations including

A CULTURAL HERITAGE SEMANTIC WEB

EXAMPLE amp PRIMERBy Guntram Geser

1 Tim Berners-Lee

Interpretation and Semantics on

the Semantic Web (1998)

httpwwww3orgDesignIssues

Interpretationhtml2 Tim Berners-Lee Semantic

Web Road Map (1998)

httpwwww3org

DesignIssuesSemantichtml3 Robert DuCharme

httplistsxmlorgarchivesxml-

dev 200211 msg00190html

Espoo City Museum Helsinki University MuseumNational Board of Antiquities NokiaTietoEnatorand the National Technology Agency (TEKES)

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’ [3]

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer/).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/

Syntactic Transformation 1: Creating the XML Documents

In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item the set of rows is combined into a single XML document.
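The rows-to-XML step can be illustrated with a minimal Python sketch. It is not the FMS implementation: the table names (image, manuscript), the view name item_view and the in-memory SQLite database are assumptions made only for illustration, and the element names follow our fictitious m-i.org example.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # namespace of the fictitious m-i.org site
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def build_demo_database():
    """Create an in-memory database with two hypothetical tables and a view."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE image (item_id TEXT, type TEXT, subject TEXT,
                            iconclass TEXT, creator TEXT, manuscript_id INTEGER);
        CREATE TABLE manuscript (manuscript_id INTEGER, title TEXT,
                                 place TEXT, year TEXT);
        -- The view is a virtual table that joins both tables.
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.subject, i.iconclass, i.creator,
                   m.title AS manuscript, m.place, m.year
            FROM image AS i JOIN manuscript AS m
              ON m.manuscript_id = i.manuscript_id;
    """)
    con.execute("INSERT INTO manuscript VALUES (1, 'Historic Bible', 'Utrecht', 'circa 1430')")
    con.execute("INSERT INTO image VALUES (?, ?, ?, ?, ?, ?)",
                ("image5kb78d38i", "Column Miniature", "Eve emerges from Adam's body",
                 "71A3421", "Alexander Master", 1))
    return con


def rows_to_xml_documents(con):
    """Return a dict mapping each item_id to one XML document (as a string)."""
    ET.register_namespace("mi", MI_NS)
    rows = con.execute(
        "SELECT item_id, type, subject, iconclass, creator, manuscript, place, year "
        "FROM item_view ORDER BY item_id")
    documents = {}
    # Group the view rows by collection item; each group becomes one document.
    for item_id, item_rows in groupby(rows, key=lambda row: row[0]):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        seen = set()
        for row in item_rows:
            for name, value in zip(FIELDS, row[1:]):
                if value is not None and (name, value) not in seen:
                    seen.add((name, value))
                    ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents
```

Calling rows_to_xml_documents(build_demo_database()) yields one document for the item image5kb78d38i. A real export would additionally write the XML declaration and validate the result against the shared XML Schema.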

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents

A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
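A small Python sketch may make this point concrete. The parser from the standard library reports <creator> and <H1> in exactly the same way; any ‘meaning’ comes from a lookup table that the processing program brings along. The table KNOWN_MEANINGS and the function names are purely hypothetical.

```python
import xml.etree.ElementTree as ET


def describe(xml_text):
    """Return (tag, text) exactly as a generic XML parser reports them."""
    element = ET.fromstring(xml_text)
    return element.tag, element.text


# Hypothetical application-level knowledge; the parser itself has none of it.
KNOWN_MEANINGS = {"creator": "the person who made the item"}


def interpret(xml_text):
    """Attach a meaning to the tag only if this program happens to know one."""
    tag, text = describe(xml_text)
    meaning = KNOWN_MEANINGS.get(tag, "unknown to this program")
    return f"{text}: {meaning}"
```

For the parser both inputs are just a tag name plus some text; only interpret distinguishes them, and only because we told it how.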

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.

Database rows grouped by item:

image5kb78d38i, 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process

XML document (item data from database rows): image5kb78d38i.xml

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names/). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
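These syntax rules can be tried out with any XML parser. The following minimal sketch uses the parser from the Python standard library; the helper name is_well_formed and the shortened sample document are our own illustrative choices.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_bytes):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


# A shortened version of the example document.
GOOD = b'''<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:creator>Alexander Master</mi:creator>
</mi:image>'''

# Each variant breaks exactly one of the rules listed above.
BAD_CASE = GOOD.replace(b"</mi:creator>", b"</mi:Creator>")      # case mismatch
BAD_UNCLOSED = GOOD.replace(b"</mi:creator>", b"")               # missing end tag
BAD_QUOTES = GOOD.replace(b'image_id="image5kb78d38i"',
                          b"image_id=image5kb78d38i")            # unquoted value
```

The parser accepts GOOD and rejects each of the three variants. Note that this only tests well-formedness, not validity against a schema.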

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation m-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “type”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
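To give an impression of what validation against this schema amounts to, here is a deliberately simplified Python sketch. It is not an XML Schema validator: it hand-codes only three of the rules expressed by the schema above (root element, required image_id attribute, ordered sequence of child elements), and the function name and messages are our own. In practice one would hand the schema and the document to a schema-aware XML processor.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Child elements in the order given by the xs:sequence of the schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_against_schema_rules(xml_bytes):
    """Return a list of problems; an empty list means these checks passed."""
    root = ET.fromstring(xml_bytes)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element must be 'image' in the m-i.org namespace")
    if root.get("image_id") is None:
        problems.append("required attribute 'image_id' is missing")
    found = [child.tag for child in root]
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != expected:
        problems.append("child elements must appear once each, in schema order")
    return problems


EXAMPLE = b"""<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
<mi:type>Column Miniature</mi:type>
<mi:subject>Eve emerges from Adam's body</mi:subject>
<mi:iconclass>71A3421</mi:iconclass>
<mi:creator>Alexander Master</mi:creator>
<mi:manuscript>Historic Bible</mi:manuscript>
<mi:place>Utrecht</mi:place>
<mi:year>circa 1430</mi:year>
</mi:image>"""
```

check_against_schema_rules(EXAMPLE) returns an empty list; removing the image_id attribute or one of the child elements produces a corresponding message.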

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge [4].

Ontology

In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge [5].

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies [6], and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate [7].

For example, for the image we have described in XML in graphic 1 the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.
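For a notation such as 71A3421, the hierarchical path can be read off the code itself, because each broader concept is a prefix of the narrower one. The Python sketch below illustrates this for our example only; the labels are those given for this image on the Medieval Illustrated Manuscripts Website, the function names are our own, and full Iconclass notations can contain further constructs that this simple prefix rule does not handle.

```python
# Labels for the example path; other notations are not covered here.
ICONCLASS_LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(notation):
    """Return all prefixes of the notation, from broadest to narrowest."""
    return [notation[:length] for length in range(1, len(notation) + 1)]


def labelled_path(notation, labels=ICONCLASS_LABELS):
    """Pair every step of the path with its textual correlate, if known."""
    return [(code, labels.get(code, "(label not in this excerpt)"))
            for code in hierarchical_path(notation)]
```

labelled_path("71A3421") runs from ("7", "Bible") down to ("71A3421", "Eve emerges from Adam's body").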

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture [9].

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’ [10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM) [11]. This provides an ontology of 81 classes and 130 properties which describes, in a formal language, concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably [12].


Hierarchical path of the Iconclass definition:
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/businessmsstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’ the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.

Graphic 2: RDF Data Model
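The data model is simple enough to be mimicked with a few lines of Python. This sketch is only meant to show the shape of triples, not to replace an RDF toolkit: the classes URI, Literal and Triple and the helper objects_of are our own, the ‘#’ separator in the m-i.org namespace is assumed, and the URI given to the image itself is hypothetical.

```python
from typing import NamedTuple, Union


class URI(str):
    """A resource identified by a URI (a node or an arc label)."""


class Literal(str):
    """A plain value such as a name (the 'box' in the graph)."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    object: Union[URI, Literal]


MI = "http://www.m-i.org/schemas/images#"      # our fictitious schema namespace
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

GRAPH = [
    # The two triples shown in graphic 2.
    Triple(URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf"), URI(MI + "Miniature")),
    Triple(URI(MI + "Miniature"), URI(RDFS + "subClassOf"), URI(MI + "Image")),
    # A statement whose object is a literal; the image URI is hypothetical.
    Triple(URI("http://www.m-i.org/images/image5kb78d38i"),
           URI(MI + "hasCreator"), Literal("Alexander Master")),
]


def objects_of(graph, subject, predicate):
    """Follow all arcs labelled `predicate` that start at `subject`."""
    return [t.object for t in graph
            if t.subject == subject and t.predicate == predicate]
```

objects_of(GRAPH, MI + "ColumnMiniature", RDFS + "subClassOf") follows the first arc of graphic 2 and returns the Miniature class.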

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.

For example, by applying the template rule
<image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
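How such a template might be instantiated can be sketched in Python. This is not how the FMS editor tool works internally; it is a minimal illustration in which a rule is a (subject path, predicate, object path) tuple, the paths use the limited XPath subset supported by the standard library, and the sketch simply substitutes the element text without any term mapping.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {"mi": "http://www.m-i.org/images"}

DOCUMENT = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# A hypothetical rule: paths are evaluated relative to the root element.
RULE = ("mi:type", "mi:hasCreator", "mi:creator")


def apply_rule(xml_text, rule):
    """Instantiate the triple template; return [] if the rule does not match."""
    root = ET.fromstring(xml_text)
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path, namespaces=NAMESPACES)
    obj = root.findtext(object_path, namespaces=NAMESPACES)
    if subject is None or obj is None:
        return []  # one of the path expressions found no element
    return [(subject, predicate, obj)]
```

apply_rule(DOCUMENT, RULE) produces a single triple with the creator’s name as its object. A document without a creator element yields no triple at all.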

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
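The idea of synonym sets attached to ontology classes can be sketched as follows. All class names and terms in this Python example are invented for illustration and are not taken from the FMS ontology.

```python
# Hypothetical synonym sets attached to (equally hypothetical) ontology classes.
SYNONYM_SETS = {
    "mi:ColumnMiniature": {"column miniature", "column illustration"},
    "mi:Initial": {"decorated initial", "initial"},
    "mi:Monogram": {"initial"},  # same term, different concept (polysemy)
}


def candidate_classes(term):
    """Return every class whose synonym set contains the term."""
    wanted = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if wanted in terms)


def resolve(term):
    """Return the class for an unambiguous term, otherwise None.

    None means the term is unknown or polysemous, so a cataloguer
    would have to select the correct interpretation by hand.
    """
    candidates = candidate_classes(term)
    return candidates[0] if len(candidates) == 1 else None
```

Synonymous terms resolve automatically to one class, whereas a polysemous term such as ‘initial’ in this invented vocabulary returns two candidates and is left to the cataloguer.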

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS)Web resources canbe defined as instances of one or more classes Inaddition classes can be organised in a hierarchicalfashionAs we hold a collection of digital images

DigiCULT 33

34 DigiCULT

drawn from illustrated medieval manuscripts wefirst need to define a class of things that are imagesIn RDF Schema a class is any resource having anrdftype property whose value is the RDFS-definedresource rdfsclass

So using the basic RDF data model we definemiImage [resource] rdftype [property] rdfsClass[value]The self-defined prefix mi (for medievalimages) stands for the URI reference of our RDFSchema namespace httpwwwm-iorgschemasimages

In our image collection we have various specialkinds of digitised images such as column miniaturesdecorated initials schematic drawings etcTo distin-guish for example the miniatures first we need todefine a general class Miniature and subclasses ofminiatures eg a subclass ColumnMiniaturemiMiniature rdftype rdfsClass miColumnMiniature rdftype rdfsClass

Secondly we need to define thatmiColumnMiniature is a subclass of miMiniatureand that miMiniature is a subclass of miImagefor which we use the predefined rdfssubClassOfpropertymiMiniatures rdfssubClassOf miImage miColumnMiniature rdfssubClassOf miMiniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
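The transitivity of rdfs:subClassOf can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are held as plain tuples rather than in an RDF store, and the helper name superclasses is hypothetical.

```python
# Triples mirroring the class declarations in the text, held as plain tuples.
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("mi:Miniature", "rdf:type", "rdfs:Class"),
    ("mi:ColumnMiniature", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", SUBCLASS_OF, "mi:Image"),
    ("mi:ColumnMiniature", SUBCLASS_OF, "mi:Miniature"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following rdfs:subClassOf."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for subject, prop, value in triples:
            if subject == current and prop == SUBCLASS_OF and value not in found:
                found.add(value)
                stack.append(value)
    return found


print(sorted(superclasses("mi:ColumnMiniature", triples)))
# ['mi:Image', 'mi:Miniature']
```

Only the two direct subclass statements are stored; the link from mi:ColumnMiniature to mi:Image is derived by walking the chain.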

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:

mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:

mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:

mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
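How such restrictions might be checked can be sketched as follows. The sketch treats rdfs:domain and rdfs:range as constraints to be verified, as this primer describes them; RDF processors may instead use them to infer the types of resources. The instance names (mi:img1, mi:img2, mi:ms7), the class mi:Manuscript and the function names are hypothetical.

```python
RDF_TYPE = "rdf:type"

# Schema statements from the text: mi:hasType has both its domain and its
# range set to mi:ColumnMiniature.
schema = {
    ("mi:hasType", "rdfs:domain", "mi:ColumnMiniature"),
    ("mi:hasType", "rdfs:range", "mi:ColumnMiniature"),
}

# Hypothetical instance data, invented for this illustration.
instances = {
    ("mi:img1", RDF_TYPE, "mi:ColumnMiniature"),
    ("mi:img2", RDF_TYPE, "mi:ColumnMiniature"),
    ("mi:ms7", RDF_TYPE, "mi:Manuscript"),
}


def types_of(resource, data):
    """Collect the classes a resource is declared to be an instance of."""
    return {o for s, p, o in data if s == resource and p == RDF_TYPE}


def violations(statement, schema, data):
    """List which constraints ('domain', 'range') a statement breaks."""
    subject, prop, value = statement
    problems = []
    for schema_prop, constraint, cls in sorted(schema):
        if schema_prop != prop:
            continue
        if constraint == "rdfs:domain" and cls not in types_of(subject, data):
            problems.append("domain")
        if constraint == "rdfs:range" and cls not in types_of(value, data):
            problems.append("range")
    return problems


print(violations(("mi:img1", "mi:hasType", "mi:img2"), schema, instances))  # []
print(violations(("mi:ms7", "mi:hasType", "mi:img1"), schema, instances))   # ['domain']
```

The second statement is flagged because its subject is declared as a manuscript, not as a column miniature.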

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’13

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com/searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
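The combination of harvested instance descriptions into one repository, and the narrowing of results by successive view restrictions, can be illustrated with a small sketch. It does not show how Ontogator works internally; the fms:, a: and b: names, the fms:material property and the function view_filter are all hypothetical.

```python
# Hypothetical shared ontology and harvested instance descriptions.
ontology = {("fms:Textile", "rdf:type", "rdfs:Class")}
museum_a = {
    ("a:item1", "rdf:type", "fms:Textile"),
    ("a:item1", "fms:material", "fms:Wool"),
}
museum_b = {
    ("b:item9", "rdf:type", "fms:Textile"),
    ("b:item9", "fms:material", "fms:Linen"),
}

# The repository is the union of the ontology and all harvested statements.
repository = ontology | museum_a | museum_b


def view_filter(repository, restrictions):
    """Return the subjects that satisfy every (property, value) restriction."""
    matches = {s for s, _, _ in repository}
    for prop, value in restrictions:
        matches &= {s for s, p, o in repository if p == prop and o == value}
    return matches


print(sorted(view_filter(repository, [("rdf:type", "fms:Textile")])))
# ['a:item1', 'b:item9']
print(sorted(view_filter(repository, [("rdf:type", "fms:Textile"),
                                      ("fms:material", "fms:Wool")])))
# ['a:item1']
```

Each additional restriction intersects the current result set with the subjects matching it, which is the sense in which constraining the views further narrows the search.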

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.

14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

[Diagram: The Finnish Museums on the Semantic Web, overview of the system’s set-up. User client: WWW browser (topic-based navigation, view-based filtering). Server: Ontogator software and RDF database holding the semantic graph, the knowledge space of shared ontology and metadata. A Web crawler harvests the RDF instances of each museum. Per museum: collection database 1 to n (relational schemas, DBMS), XML repository (XML Schema: syntactic interoperability), metadata editor, RDF instances (RDF Schema: semantic interoperability).]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems. ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
| SPECTRUM-XML: A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University. Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03) funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’3

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to be gaining momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples. (Cf. http://www.w3.org/TR/rdf-primer)

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture

Syntactic Transformation 1: Creating the XML Documents

In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
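The grouping step can be sketched with Python’s standard library. This is a minimal illustration, not the FMS implementation: the rows are hard-coded instead of being read from a database view, the element name item, the identifier m1 and the subject values are hypothetical, and no XML Schema validation is performed.

```python
import xml.etree.ElementTree as ET
from itertools import groupby
from operator import itemgetter

# Hypothetical rows as a view might return them: one row per combination of
# values, so an item with two subjects appears twice.
rows = [
    {"item_id": "m1", "type": "column miniature",
     "creator": "Alexander Master", "subject": "Eve"},
    {"item_id": "m1", "type": "column miniature",
     "creator": "Alexander Master", "subject": "Adam"},
]


def rows_to_documents(rows):
    """Group rows by item_id and build one XML document string per item."""
    documents = {}
    ordered = sorted(rows, key=itemgetter("item_id"))
    for item_id, group in groupby(ordered, key=itemgetter("item_id")):
        item = ET.Element("item", id=item_id)
        seen = set()  # avoid repeating values shared by several rows
        for row in group:
            for column, value in row.items():
                if column == "item_id" or (column, value) in seen:
                    continue
                seen.add((column, value))
                ET.SubElement(item, column).text = value
        documents[item_id] = ET.tostring(item, encoding="unicode")
    return documents


print(rows_to_documents(rows)["m1"])
```

Values that repeat across the rows of one item (here the type and the creator) are written once, while differing values (the two subjects) each become their own element.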

XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.


FMS Documents: A short presentation is provided in Eero Hyvönen et al., Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen, Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al., Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam's body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Graphic 1: Database rows to XML process

Database rows (item data from database rows, grouped by item):
"image5kb78d38i", "Column Miniature", "Eve emerges from Adam's Body", "71A3421", "Alexander Master", "Historic Bible", "Utrecht", "circa 1430"

XML document (image5kb78d38i.xml), result of the rows to XML process:
(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
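Whether a document obeys these syntax rules can be tested by simply handing it to an XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the shortened sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

GOOD = """<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Start and end tag differ in case, so this variant is not well-formed.
BAD_CASE = GOOD.replace("</mi:creator>", "</mi:Creator>")

# The attribute value is not within quotation marks.
BAD_QUOTES = "<image image_id=image5kb78d38i></image>"


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        # Encode to bytes so the declared encoding is the one actually used.
        ET.fromstring(text.encode("iso-8859-1"))
        return True
    except ET.ParseError:
        return False


for name, sample in [("good", GOOD), ("case", BAD_CASE), ("quotes", BAD_QUOTES)]:
    print(name, is_well_formed(sample))
```

Well-formedness says nothing about whether the right elements are present; that is the job of the XML Schema discussed next.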

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative's XML Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)    <xs:element name="image">
(4)      <xs:complexType>
(5)        <xs:sequence>
(6)          <xs:element name="type" type="xs:string"/>
(7)          <xs:element name="subject" type="xs:string"/>
(8)          <xs:element name="iconclass" type="xs:string"/>
(9)          <xs:element name="creator" type="xs:string"/>
(10)         <xs:element name="manuscript" type="xs:string"/>
(11)         <xs:element name="place" type="xs:string"/>
(12)         <xs:element name="year" type="xs:string"/>
(13)       </xs:sequence>
(14)       <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)     </xs:complexType>
(16)   </xs:element>
(17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
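To give an impression of what validation against such a schema checks, the sketch below hand-codes three of its rules in Python: the root element, the required image_id attribute and the ordered sequence of child elements. It is deliberately not an XML Schema validator (Python's standard library does not include one, and a real deployment would use an XSD-aware tool); the function name check_image_document is an assumption.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace used in this primer
# Child elements in the order required by the xs:sequence of the example schema.
SEQUENCE = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]

DOC = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""


def check_image_document(xml_bytes):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(xml_bytes)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not image in the expected namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    expected = [f"{{{MI_NS}}}{name}" for name in SEQUENCE]
    if found != expected:
        problems.append("child elements do not match the required sequence")
    return problems


print(check_image_document(DOC))
```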

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology

In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]


7 Bible
  71 Old Testament
    71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
      71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
        71A34 creation of Eve
          71A342 Eve is fashioned from Adam's rib
            71A3421 Eve emerges from Adam's body
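In the path shown above, every broader concept carries a notation that is a prefix of the narrower one. For plain notations of this kind the path can therefore be derived from the code itself, as the following Python sketch shows. It is an illustration only and rests on the assumption that the notation contains no bracketed text, keys or other Iconclass extensions; the function name iconclass_path is hypothetical.

```python
# Textual correlates copied from the hierarchical path shown above.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise "
           "and later years of Adam and Eve",
    "71A3": "creation of man, the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def iconclass_path(notation):
    """Return all prefixes of a plain notation, broadest concept first."""
    return [notation[:length] for length in range(1, len(notation) + 1)]


for depth, code in enumerate(iconclass_path("71A3421")):
    print("  " * depth + code, LABELS[code])
```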

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples' the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry, and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs) and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.

Graphic 2: RDF Data Model

Triple 1
  Subject:   http://www.m-i.org/schemas/images#ColumnMiniature
  Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
  Object:    http://www.m-i.org/schemas/images#Miniature

Triple 2
  Subject:   http://www.m-i.org/schemas/images#Miniature
  Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
  Object:    http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
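The two statements of graphic 2 can be written down as plain subject, predicate, object tuples. The Python sketch below is meant only to make the data model tangible; it uses no RDF library, the Triple type and the objects helper are assumptions, and the m-i.org namespace is the fictitious one of this primer.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # fictitious namespace of this primer


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# A graph is simply a set of triples.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow the arcs labelled `predicate` that leave the node `subject`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


print(objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf"))
```

Note that Miniature appears once as an object and once as a subject: the same URI is one node in the graph, which is how separate statements become linked.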

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML processing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule <image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Image Miniature, mi:hasCreator, 'Alexander Master'>
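How a triple template is instantiated from an XML document can be sketched as follows. This is not the rule syntax of the FMS editor tool, which the available sources do not spell out: the rule format (two element paths and a property name), the function apply_rule and the shortened document are assumptions, and Python's ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}  # fictitious namespace of this primer

DOC = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Hypothetical rule: (path to the subject value, property, path to the object value).
RULE = ("mi:type", "mi:hasCreator", "mi:creator")


def apply_rule(root, rule):
    """Instantiate the triple template; return [] if a path does not match."""
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path, namespaces=NS)
    obj = root.findtext(object_path, namespaces=NS)
    if subject is None or obj is None:
        return []
    return [(subject, predicate, obj)]


triples = apply_rule(ET.fromstring(DOC), RULE)
print(triples)
```

In this sketch the subject is filled with the literal content of the type element of the sample document; a rule whose paths match nothing simply produces no triples.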

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.

Note: Due to the limited space available, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In the section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images#

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
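The effect of transitivity can be made explicit with a few lines of Python that collect all direct and inferred superclasses of a class. The dictionary layout and the function name superclasses are assumptions chosen for this sketch; an RDF Schema aware system would derive the same result from the triples themselves.

```python
# Direct rdfs:subClassOf statements of the example, keyed by subclass.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, subclass_of):
    """Return all direct and inferred superclasses of cls."""
    result = set()
    stack = list(subclass_of.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in result:  # also guards against accidental cycles
            result.add(parent)
            stack.extend(subclass_of.get(parent, ()))
    return result


print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_OF)))
```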

Defining properties

In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images#) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
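Read as constraints in the way this primer describes them, domain and range can be checked mechanically for each statement. The Python sketch below shows the idea under loud assumptions: the class mi:Person, the instance names and the constraint table are invented for illustration, and the check compares classes exactly instead of following subclass relationships.

```python
# Hypothetical typing of two instances (instance name -> class).
CLASS_OF = {
    "image5kb78d38i": "mi:ColumnMiniature",
    "AlexanderMaster": "mi:Person",  # mi:Person is an assumed class
}

# Hypothetical constraints: property -> its domain and range classes.
CONSTRAINTS = {
    "mi:hasCreator": {"domain": "mi:ColumnMiniature", "range": "mi:Person"},
}


def violations(statement):
    """Return the constraint violations of one (subject, property, value) statement."""
    subject, prop, value = statement
    rule = CONSTRAINTS.get(prop)
    if rule is None:
        return [f"unknown property {prop}"]
    found = []
    if CLASS_OF.get(subject) != rule["domain"]:
        found.append("subject is outside the domain")
    if CLASS_OF.get(value) != rule["range"]:
        found.append("value is outside the range")
    return found


print(violations(("image5kb78d38i", "mi:hasCreator", "AlexanderMaster")))
print(violations(("AlexanderMaster", "mi:hasCreator", "image5kb78d38i")))
```

A validator of the kind described for the FMS editor tool would presumably also accept instances of subclasses; that refinement is left out here to keep the sketch short.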

Benefits of RDF

In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'[13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
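The two ideas of this section, the repository as a union of RDF cards and the selection of instances by class, can be illustrated with plain Python sets. This sketch does not reproduce how Ontogator works internally, which the sources cited here do not detail; the item names, the property mi:hasPlace and the helper functions are assumptions, and the class hierarchy is taken to be free of cycles.

```python
# Shared ontology and two hypothetical RDF cards harvested from different museums.
ontology = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}
card_a = {("item1", "rdf:type", "mi:ColumnMiniature"), ("item1", "mi:hasPlace", "Utrecht")}
card_b = {("item2", "rdf:type", "mi:Image"), ("item2", "mi:hasPlace", "Paris")}

# The repository is the union of the ontology and all harvested cards.
repository = ontology | card_a | card_b


def is_subclass(graph, cls, ancestor):
    """True if cls equals ancestor or reaches it via rdfs:subClassOf arcs."""
    if cls == ancestor:
        return True
    return any(is_subclass(graph, o, ancestor)
               for s, p, o in graph if s == cls and p == "rdfs:subClassOf")


def instances_of(graph, cls):
    """All instances whose class is cls or one of its subclasses."""
    return sorted(s for s, p, o in graph
                  if p == "rdf:type" and is_subclass(graph, o, cls))


def filter_view(graph, cls, prop, value):
    """Constrain a class view further by a property value."""
    return [i for i in instances_of(graph, cls) if (i, prop, value) in graph]


print(instances_of(repository, "mi:Image"))
print(filter_view(repository, "mi:Image", "mi:hasPlace", "Utrecht"))
```

Selecting the broad class mi:Image returns both items, because the subclass statements in the ontology are taken into account; each further restriction narrows the view.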

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]

This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.¹⁵

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action, we mean reactive, proactive, social.’

| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’

| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

| Learning: an agent will improve its performance over time.
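The three aspects of flexible autonomous action can be sketched in a few lines of code. The example below is a deliberately simple toy and not an implementation from Wooldridge’s book: the class WatchAgent, the record structure and the ‘broker’ addressee are all hypothetical. It only shows where reactive, proactive and social behaviour would sit in one perceive-decide-act cycle.

```python
class WatchAgent:
    """Toy agent that collects records on one topic (hypothetical example)."""

    def __init__(self, topic):
        self.topic = topic   # the goal: gather records about this topic
        self.seen = set()    # record ids the agent has already looked at
        self.outbox = []     # messages addressed to other agents

    def step(self, environment):
        """Run one perceive-decide-act cycle and return the actions taken."""
        actions = []
        # Reactive: respond to records that appeared since the last cycle.
        new = [r for r in environment["records"] if r["id"] not in self.seen]
        for record in new:
            self.seen.add(record["id"])
            if self.topic in record["subjects"]:
                actions.append(("collect", record["id"]))
        # Proactive: keep pursuing the goal even if nothing relevant changed.
        if not actions:
            # Social: ask another agent for help by sending it a message.
            self.outbox.append({"to": "broker", "ask": self.topic})
            actions.append(("ask_broker", self.topic))
        return actions


env = {"records": [{"id": "r1", "subjects": ["textiles"]},
                   {"id": "r2", "subjects": ["coins"]}]}
agent = WatchAgent("textiles")
print(agent.step(env))   # first cycle: reacts to the new records
print(agent.step(env))   # second cycle: nothing new, so it asks a broker
```

Mobility, rationality and learning are not modelled here; each would require considerably more machinery, which is part of the challenge the primer points to.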


14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3, 1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (The Finnish Museums on the Semantic Web: Overview of the System’s Set-up)

Components labelled in the graphic: user client (WWW browser; topic-based navigation, view-based filtering); server (Ontogator software); RDF database (semantic graph: knowledge space of shared ontology and metadata); Web crawler; RDF instances; Metadata Editor; RDF Schema (semantic interoperability); XML repositories; XML Schema (syntactic interoperability); relational schemas (DBMS) of collection databases 1, 2 ... n.


THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he worked again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 – Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 – Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria.
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


ITEM_VIEW (item_id, type, subject, iconclass, creator, manuscript, place, year)


XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image, 71A3421: Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.

Database rows, grouped by item:

    ‘image5kb78d38i’, ‘Column Miniature’, ‘Eve emerges from Adam’s Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

Rows to XML process

XML document (item data from database rows), image5kb78d38i.xml:

    (1)  <?xml version="1.0" encoding="ISO-8859-1"?>
    (2)  <mi:image xmlns:mi="http://www.m-i.org/images"
             image_id="image5kb78d38i">
    (3)    <mi:type>Column Miniature</mi:type>
    (4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
    (5)    <mi:iconclass>71A3421</mi:iconclass>
    (6)    <mi:creator>Alexander Master</mi:creator>
    (7)    <mi:manuscript>Historic Bible</mi:manuscript>
    (8)    <mi:place>Utrecht</mi:place>
    (9)    <mi:year>circa 1430</mi:year>
    (10) </mi:image>

Graphic 1: Database rows to XML process

(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace, declared in the start tag of the root element, is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
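The rows to XML process and the well-formedness rules can be tried out with a few lines of code. The following is a minimal sketch using the ElementTree module of the Python standard library; it is not the software used by the museums. The function names row_to_xml and is_well_formed are our own, and the row values are those of the example above.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"   # namespace URI from the example
ET.register_namespace("mi", NS)    # serialise elements with the mi: prefix

# Column names of the row (without the id) and the example row itself.
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]
ROW = ("image5kb78d38i", "Column Miniature", "Eve emerges from Adam's body",
       "71A3421", "Alexander Master", "Historic Bible", "Utrecht", "circa 1430")


def row_to_xml(row):
    """Turn one database row into a namespace-qualified XML string."""
    image_id, *values = row
    root = ET.Element(f"{{{NS}}}image", {"image_id": image_id})
    for name, value in zip(FIELDS, values):
        ET.SubElement(root, f"{{{NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


doc = row_to_xml(ROW)
print(doc)
print(is_well_formed(doc))
# Start and end tag differ in case, so this fragment is rejected.
print(is_well_formed("<Creator>Alexander Master</creator>"))
```

Note that the parser only checks the syntax. Whether the document also has the expected elements in the expected order is a matter of validation against a schema.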

Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:

| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

    (1)  <?xml version="1.0"?>
    (2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    (2b)     targetNamespace="http://www.m-i.org/images"
    (2c)     xmlns="http://www.m-i.org/images"
    (2d)     elementFormDefault="qualified">
    (3)    <xs:element name="image">
    (4)      <xs:complexType>
    (5)        <xs:sequence>
    (6)          <xs:element name="type" type="xs:string"/>
    (7)          <xs:element name="subject" type="xs:string"/>
    (8)          <xs:element name="iconclass" type="xs:string"/>
    (9)          <xs:element name="creator" type="xs:string"/>
    (10)         <xs:element name="manuscript" type="xs:string"/>
    (11)         <xs:element name="place" type="xs:string"/>
    (12)         <xs:element name="year" type="xs:string"/>
    (13)       </xs:sequence>
    (14)       <xs:attribute name="image_id" type="xs:string"
                   use="required"/>
    (15)     </xs:complexType>
    (16)   </xs:element>
    (17) </xs:schema>

Short explanations

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.

(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.

(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.

(2c) Is our default namespace.

(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.

(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.

(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.

(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
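What the schema demands of a document can be spelled out as a handful of explicit checks. The sketch below is hand-written for this one schema and is not a general XML Schema validator (the Python standard library does not include one; third-party packages such as lxml or xmlschema offer real validation). The function name check_image is our own.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.m-i.org/images}"   # target namespace, see (2b)
# Child elements in the order fixed by xs:sequence, see (5)-(13).
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image(xml_text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    # (2d), (3): the root must be a namespace-qualified image element.
    if root.tag != NS + "image":
        problems.append("root element is not a namespace-qualified 'image'")
    # (14): the attribute image_id is required.
    if root.get("image_id") is None:
        problems.append("required attribute 'image_id' is missing")
    # (5)-(13): each child exactly once, in the declared order.
    children = [child.tag for child in root]
    if children != [NS + name for name in EXPECTED_CHILDREN]:
        problems.append("child elements differ from the declared sequence")
    return problems


body = "".join(f"<mi:{n}>text</mi:{n}>" for n in EXPECTED_CHILDREN)
good = ('<mi:image xmlns:mi="http://www.m-i.org/images" image_id="x">'
        + body + "</mi:image>")
bad = '<mi:image xmlns:mi="http://www.m-i.org/images"><mi:year>1430</mi:year></mi:image>'
print(check_image(good))
print(check_image(bad))
```

Since all elements are declared as xs:string, there is nothing to check about the values themselves; with data types such as xs:date or xs:integer a validator would also test the element content.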

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling for example the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
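The principle of multi-channel publishing can be shown with a small sketch in which the same XML record is rendered once as an HTML fragment and once as a line of plain text. Two plain Python functions stand in for style sheets here, which is a simplification: in practice the display rules would typically live in a separate style sheet rather than in program code. The record is a shortened version of the example document, and the function names are our own.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = {"mi": "http://www.m-i.org/images"}
RECORD = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:creator>Alexander Master</mi:creator>
  <mi:year>circa 1430</mi:year>
</mi:image>"""


def fields(xml_text):
    """Extract the content once; every output channel reuses it."""
    root = ET.fromstring(xml_text)
    return {name: root.findtext(f"mi:{name}", namespaces=NS)
            for name in ("subject", "creator", "year")}


def to_html(xml_text):
    """Channel 1: an HTML fragment for a Web page."""
    f = fields(xml_text)
    return (f"<h1>{escape(f['subject'])}</h1>"
            f"<p>{escape(f['creator'])}, {escape(f['year'])}</p>")


def to_text(xml_text):
    """Channel 2: a one-line caption, e.g. for a printed list."""
    f = fields(xml_text)
    return f"{f['subject']} ({f['creator']}, {f['year']})"


print(to_html(RECORD))
print(to_text(RECORD))
```

The content is stored only once; adding a further channel means adding another rendering, not touching the data.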

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological conceptsThe goal of the FMS project is to make metadataof the museumsrsquo textiles collections semanticallyinteroperable on the Web In order to achieve suchinteroperability an ontology is being designed thatdescribes the common (lower-level) ontologicalconcepts in this domain of knowledge4

Ontology

In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.

[5] For more elaborate and formal descriptions see: Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421: Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
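The hierarchical path for 71A3421 can be sketched in a few lines of Python. The sketch assumes that every ancestor's notation is a prefix of the code, which holds for the path quoted in this primer; real Iconclass notations may contain further elements that this toy function does not handle. The function name and the dictionary are illustrative, not part of any Iconclass software.

```python
# Notations and labels taken from the hierarchical path quoted in this primer.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man, the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

def hierarchical_path(notation: str) -> list[str]:
    """Return the known notations on the path to `notation`, broadest first.

    Assumption: an ancestor's notation is always a prefix of its descendants.
    """
    prefixes = (notation[:i] for i in range(1, len(notation) + 1))
    return [p for p in prefixes if p in LABELS]

for code in hierarchical_path("71A3421"):
    print(code, LABELS[code])
```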

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’[10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]


Hierarchical path for 71A3421 in the Iconclass system:
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.

[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl

[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorgbusinessmsstempexhtml

[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html

[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327

[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.

[12] Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).

RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources;

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;

| statements: these associate a value for a named property with the resource.

Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object, and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
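To make the triple structure concrete, here is a minimal Python sketch that stores a small graph as a set of (subject, predicate, object) tuples, using the subclass statements of the primer's running example (ColumnMiniature, Miniature, Image). The `Triple` type and the helper functions are hypothetical names chosen for illustration, and the `#` separator in the m-i.org namespace is an assumption.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"   # '#' separator is an assumption

class Triple(NamedTuple):
    subject: str     # a URI
    predicate: str   # a URI (the arc label)
    obj: str         # a URI or a literal

graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}

def nodes(g):
    """All subjects and objects, i.e. the nodes of the labelled directed graph."""
    return {t.subject for t in g} | {t.obj for t in g}

def arcs_from(g, node):
    """Outgoing arcs of a node as (predicate, object) pairs."""
    return sorted((t.predicate, t.obj) for t in g if t.subject == node)

print(sorted(nodes(graph)))
print(arcs_from(graph, MI + "Miniature"))
```

Because a predicate is itself a URI, the same string could also appear as a subject or object in another triple, which is the node-and-arc-label point made above.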

Graphic 2: RDF Data Model

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.

For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
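The mapping-rule mechanism can be sketched as follows. This is not the FMS tool: the XML record, its element names and the XPath expressions are hypothetical, and the sketch relies on the limited XPath subset supported by Python's standard `xml.etree.ElementTree` module. A rule is a triple template whose subject and object positions hold XPath expressions; applying it substitutes the matching element values.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML metadata record; the real FMS element names are not shown here.
XML_DOC = """<record>
  <image>
    <type>Miniature</type>
    <creator>Alexander Master</creator>
  </image>
</record>"""

# A mapping rule: (XPath for the subject, property name, XPath for the object).
RULE = ("./image/type", "mi:hasCreator", "./image/creator")

def apply_rule(rule, xml_text):
    """Instantiate the triple template against one XML document.

    Returns a list with one triple if both XPath expressions match,
    and an empty list if the rule does not match the document.
    """
    subject_path, predicate, object_path = rule
    root = ET.fromstring(xml_text)
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []
    return [(subject, predicate, obj)]

print(apply_rule(RULE, XML_DOC))
```

A production tool would evaluate full XPath and handle repeated elements; the point here is only the substitution step that turns a template into concrete triples.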

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
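The effect of transitivity can be illustrated with a small Python sketch that derives the implied superclasses from the two explicit statements. The function name `superclasses` and the pair-based representation are illustrative assumptions, not part of RDF Schema itself.

```python
# Explicit rdfs:subClassOf statements as (subclass, superclass) pairs.
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}

def superclasses(cls, pairs):
    """Return all direct and implied superclasses of `cls`.

    Follows rdfs:subClassOf links transitively; already visited classes
    are skipped, so cyclic declarations do not cause an endless loop.
    """
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in pairs:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found

print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_OF)))
```

Only two statements were declared, yet mi:Image is reported as a superclass of mi:ColumnMiniature, which is exactly the implicit relationship described above.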

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
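A validator in the spirit of the ‘semantic metadata validator’ described earlier could check statements against these domain and range declarations. The sketch below reads rdfs:domain and rdfs:range as constraints, as this primer does; later versions of the RDF Schema specification treat them as a basis for inference rather than validation. The instance data, the `ex:` names and the function `violations` are hypothetical.

```python
# Constraints as declared in the two examples above.
CONSTRAINTS = {
    "mi:hasType": {"domain": "mi:ColumnMiniature", "range": "mi:ColumnMiniature"},
}

# Hypothetical instance data: resource -> class it is an instance of.
INSTANCE_OF = {
    "ex:image5": "mi:ColumnMiniature",
    "ex:type1": "mi:ColumnMiniature",
    "ex:museum": "ex:Museum",
}

def violations(statement):
    """Return the constraint violations of one (subject, property, value) statement."""
    subject, prop, value = statement
    rule = CONSTRAINTS.get(prop)
    if rule is None:
        return [f"unknown property {prop}"]
    problems = []
    if INSTANCE_OF.get(subject) != rule["domain"]:
        problems.append(f"{subject} is not an instance of {rule['domain']}")
    if INSTANCE_OF.get(value) != rule["range"]:
        problems.append(f"{value} is not an instance of {rule['range']}")
    return problems

print(violations(("ex:image5", "mi:hasType", "ex:type1")))
print(violations(("ex:museum", "mi:hasType", "ex:type1")))
```

For brevity the check compares classes directly; a fuller version would also accept instances of subclasses of the declared domain or range.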

Benefits of RDF

In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’[13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
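Conceptually, combining the harvested RDF cards with the shared ontology is a set union over triples. The following Python sketch shows that step only; the museum prefixes, object identifiers and the function `build_repository` are hypothetical, and the actual harvesting over the network is left out.

```python
# Shared ontology and two harvested RDF cards, each a set of triples.
ONTOLOGY = {("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature")}
CARD_MUSEUM_A = {("a:obj1", "rdf:type", "mi:ColumnMiniature")}
CARD_MUSEUM_B = {
    ("b:obj9", "rdf:type", "mi:Miniature"),
    ("b:obj9", "mi:hasCreator", "Alexander Master"),
}

def build_repository(ontology, cards):
    """Union the shared ontology with every harvested card into one graph."""
    repository = set(ontology)
    for card in cards:
        repository |= set(card)
    return repository

repository = build_repository(ONTOLOGY, [CARD_MUSEUM_A, CARD_MUSEUM_B])
print(len(repository))
```

Because the graph is a set, harvesting the same card twice does not duplicate statements, and each museum keeps authority over its own card.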

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
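View-based filtering can be approximated with a short Python sketch: selecting a class returns every instance of that class or of one of its subclasses, and selecting a narrower class shrinks the result. This is not how Ontogator is implemented; the data, the `ex:` identifiers and the single-parent class hierarchy are simplifying assumptions.

```python
# Hypothetical single-parent class hierarchy: class -> direct superclass.
SUBCLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}

# Hypothetical instances: resource -> class it was catalogued under.
INSTANCES = {
    "ex:img1": "mi:ColumnMiniature",
    "ex:img2": "mi:Miniature",
    "ex:img3": "mi:Image",
}

def is_a(cls, target):
    """True if `cls` equals `target` or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

def filter_view(selected_class):
    """Return the instances that fall under the selected class (the 'view')."""
    return sorted(i for i, c in INSTANCES.items() if is_a(c, selected_class))

print(filter_view("mi:Image"))
print(filter_view("mi:Miniature"))
print(filter_view("mi:ColumnMiniature"))
```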

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.[15]

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.


[14] T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

[15] M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer


K. S. Candan, H. Liu, R. Suvarna, Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3(1) (2001), 6-19.

Pierre-Antoine Champin, RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik, Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

The Finnish Museums on the Semantic Web: Overview of the System’s Set-up
| User client: WWW browser (topic-based navigation, view-based filtering)
| Server: Ontogator software; RDF database holding the semantic graph (knowledge space of shared ontology and metadata)
| Web crawler: collects the RDF instances
| RDF Schema (semantic interoperability): Metadata Editor, RDF instances
| XML Schema (syntactic interoperability): XML repositories
| Relational schemas (DBMS): collection databases 1, 2, ... n

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in a doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
| SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it


DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)', which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT, please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DIGICULT PROJECT INFORMATION

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50, illum.: 'Sub-Fauvel' Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type></mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
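The syntax rules above can be checked mechanically by any XML parser. The following is a minimal sketch in Python, using only the standard library; the element values and the helper names (is_well_formed, qualified_names) are illustrative assumptions and are not taken from the FMS documents.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document in the spirit of graphic 1; the element
# values are illustrative placeholders.
GOOD = """<?xml version="1.0"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Same document, but one end tag differs in case from its start tag.
BAD_CASE = GOOD.replace("</mi:creator>", "</mi:Creator>")

# Same document, but the attribute value is not within quotation marks.
BAD_QUOTES = GOOD.replace('"image5kb78d38i"', "image5kb78d38i")


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def qualified_names(xml_text):
    """List element names in '{namespace}local' form, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


if __name__ == "__main__":
    for label, text in [("good", GOOD), ("case", BAD_CASE), ("quotes", BAD_QUOTES)]:
        print(label, is_well_formed(text))
    print(qualified_names(GOOD))
```

The second helper shows what the prefix mi stands for: the parser replaces it with the full namespace name, so two documents that use different prefixes for the same namespace yield the same qualified element names.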

Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative's XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document;
| which elements are child elements, as well as their order and number;
| whether an element is empty or can include text;
| the data types for elements and attributes;
| as well as default and fixed values for elements and attributes.
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.

(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)    <xs:element name="image">
(4)      <xs:complexType>
(5)        <xs:sequence>
(6)          <xs:element name="type" type="xs:string"/>
(7)          <xs:element name="subject" type="xs:string"/>
(8)          <xs:element name="iconclass" type="xs:string"/>
(9)          <xs:element name="creator" type="xs:string"/>
(10)         <xs:element name="manuscript" type="xs:string"/>
(11)         <xs:element name="place" type="xs:string"/>
(12)         <xs:element name="year" type="xs:string"/>
(13)       </xs:sequence>
(14)       <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)     </xs:complexType>
(16)   </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "type", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
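To make the role of the schema more tangible, the sketch below reads the declarations described above and applies two of them to a document: the ordered sequence of child elements and the required attribute. It is written in Python with the standard library only and is deliberately narrow. It is not an XML Schema validator (in practice a full validator would be used), and the instance values and function names (read_declaration, conforms) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
MI = "{http://www.m-i.org/images}"

SCHEMA = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.m-i.org/images"
           xmlns="http://www.m-i.org/images"
           elementFormDefault="qualified">
  <xs:element name="image">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="type" type="xs:string"/>
        <xs:element name="subject" type="xs:string"/>
        <xs:element name="iconclass" type="xs:string"/>
        <xs:element name="creator" type="xs:string"/>
        <xs:element name="manuscript" type="xs:string"/>
        <xs:element name="place" type="xs:string"/>
        <xs:element name="year" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="image_id" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Hypothetical instance document; the values are placeholders.
DOCUMENT = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Miniature</mi:type>
  <mi:subject>Creation of Eve</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>c. 1430</mi:year>
</mi:image>"""


def read_declaration(schema_text, name="image"):
    """Return (ordered child names, required attribute names) for an element."""
    root = ET.fromstring(schema_text)
    for declaration in root.findall(f"{XS}element"):
        if declaration.get("name") == name:
            complex_type = declaration.find(f"{XS}complexType")
            children = [element.get("name")
                        for element in complex_type.findall(f"{XS}sequence/{XS}element")]
            required = [attribute.get("name")
                        for attribute in complex_type.findall(f"{XS}attribute")
                        if attribute.get("use") == "required"]
            return children, required
    raise KeyError(name)


def conforms(document_text, schema_text):
    """Check root name, required attributes and child order only."""
    children, required = read_declaration(schema_text)
    root = ET.fromstring(document_text)
    if root.tag != MI + "image":
        return False
    if any(attribute not in root.attrib for attribute in required):
        return False
    return [child.tag for child in root] == [MI + name for name in children]
```

Because xs:sequence defines an ordered sequence, a document that contains the right elements in the wrong order is rejected by this check, as is one that lacks the image_id attribute.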

Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet: Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
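The idea of multi-channel publishing can be illustrated with a small Python sketch in which one XML record is rendered for two channels. Note the substitution: two plain functions stand in for the style sheets mentioned above, purely to keep the example self-contained, and the record content and function names are assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical record; one source, several presentations.
RECORD = (
    '<image image_id="image5kb78d38i">'
    "<subject>Eve emerges from Adam's body</subject>"
    "<creator>Alexander Master</creator>"
    "</image>"
)


def to_html(xml_text):
    """Render the record as an HTML definition list (Web channel)."""
    root = ET.fromstring(xml_text)
    rows = "".join(
        f"<dt>{escape(child.tag, quote=False)}</dt>"
        f"<dd>{escape(child.text or '', quote=False)}</dd>"
        for child in root
    )
    return f"<dl>{rows}</dl>"


def to_text(xml_text):
    """Render the same record as plain text (e.g. for a print label)."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{child.tag}: {child.text or ''}" for child in root)


if __name__ == "__main__":
    print(to_html(RECORD))
    print(to_text(RECORD))
```

The point of the design is that the record itself carries no display information; each channel adds its own.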

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web, and the perceived need to make it more machine-processable, have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421: Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
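How a hierarchical path follows from the classification code can be sketched in a few lines of Python. The sketch assumes that every additional character of the notation narrows the concept, which holds for the code 71A3421 but is a simplification of the full Iconclass notation; the labels are copied from the hierarchical path given for this image, and the function name is hypothetical.

```python
# Labels copied from the hierarchical path for this image; a real
# application would look them up in the Iconclass system instead.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise "
           "and later years of Adam and Eve",
    "71A3": "creation of man, the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(notation):
    """Return every prefix of the notation, from broadest to narrowest.

    Simplifying assumption: one extra character equals one level deeper.
    """
    return [notation[:length] for length in range(1, len(notation) + 1)]


if __name__ == "__main__":
    for depth, code in enumerate(hierarchical_path("71A3421")):
        print("  " * depth + code, LABELS[code])
```

A search for the broader code 71A3 could use the same prefix relation to retrieve this image along with every other image classified anywhere below 'creation of man'.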

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/mss/tempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources;

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;

| statements: these associate a value for a named property with the resource.

Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
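The data model described above is small enough to sketch directly. The Python below represents a graph as a set of subject, predicate, object triples, with literals kept distinct from URIs. The namespace URI with a trailing '#', the instance URI and the property hasCreator are assumptions chosen for illustration; this is a toy model, not an RDF library.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])
Literal = namedtuple("Literal", ["value"])  # objects may be literals, not only URIs

# Assumed namespace URIs (the '#' separator is an assumption for m-i.org).
MI = "http://www.m-i.org/schemas/images#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical URI for the image record itself.
IMAGE = "http://www.m-i.org/images/image5kb78d38i"

GRAPH = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
    Triple(IMAGE, MI + "hasCreator", Literal("Alexander Master")),
}


def objects(graph, subject, predicate):
    """Follow every arc labelled `predicate` that leaves the node `subject`."""
    return {triple.object for triple in graph
            if triple.subject == subject and triple.predicate == predicate}


def is_literal(node):
    """Literals are the 'boxes' of the graph; everything else is a URI node."""
    return isinstance(node, Literal)
```

Because every statement is a binary relation, any query reduces to matching one or more of the three positions of a triple, which is what the objects helper does for two of them.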

Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.

Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case, a tool developed by the project team called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, 'Alexander Master'>
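The mechanics of such a rule can be sketched as follows. This is an illustration in Python under stated assumptions, not the FMS editor tool: the rule format (a tuple in which path expressions start with './'), the function name apply_rule and the document values are invented for the example, and the standard library supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

# Hypothetical XML document with the two elements the rule refers to.
DOCUMENT = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Image Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# A rule is a triple template: parts starting with "./" are path expressions
# to evaluate against the document; anything else is copied unchanged.
RULE = ("./mi:type", "mi:hasCreator", "./mi:creator")


def apply_rule(rule, document_text):
    """Instantiate the template, or return None if the rule does not match."""
    root = ET.fromstring(document_text)
    triple = []
    for part in rule:
        if part.startswith("./"):
            element = root.find(part, NS)
            if element is None or element.text is None:
                return None  # a path found no value: the rule does not match
            triple.append(element.text.strip())
        else:
            triple.append(part)
    return tuple(triple)


if __name__ == "__main__":
    print(apply_rule(RULE, DOCUMENT))
```

A rule that refers to an element the document does not contain yields no triple at all, which mirrors the condition 'if the rule matches' in the text above.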

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
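The difference between the two cases in the note can be made concrete with a small Python sketch. All class names and terms below are invented for illustration and do not come from the FMS ontology; the sketch only shows why a synonym can be resolved automatically while a polysemous term has to be passed to a person.

```python
# Hypothetical synonym sets attached to ontology classes; the class names
# and terms are invented for illustration.
SYNONYM_SETS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:DecoratedInitial": {"decorated initial", "initial"},
    "mi:Monogram": {"monogram", "initial"},
}


def candidate_classes(term):
    """Return every class whose synonym set contains the term."""
    wanted = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if wanted in terms)


def resolve(term):
    """Return the single matching class, or None when a person must decide."""
    candidates = candidate_classes(term)
    return candidates[0] if len(candidates) == 1 else None


if __name__ == "__main__":
    print(resolve("Illumination"))       # synonym: resolved automatically
    print(candidate_classes("initial"))  # polysemous: left to the cataloguer
```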

Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images#

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly we need to define thatmiColumnMiniature is a subclass of miMiniatureand that miMiniature is a subclass of miImagefor which we use the predefined rdfssubClassOfpropertymiMiniatures rdfssubClassOf miImage miColumnMiniature rdfssubClassOf miMiniature

As the rdfssubClassOf property is transitive thismeans that miColumnMiniature is also implicitly asubclass of miImage

Graphic 2 on page 32 visualises this with the nodesand arcs of the basic RDF data model
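
The transitivity of rdfs:subClassOf can be illustrated with a few lines of Python. This is only a minimal sketch, not part of any RDF toolkit: the class names follow the mi: example above, while the set SUBCLASS_OF and the function superclasses are names assumed here for illustration.

```python
# Class hierarchy from the example, written as (subclass, superclass) pairs.
# SUBCLASS_OF and superclasses() are illustrative names, not an RDF API.
SUBCLASS_OF = {
    ("mi:Miniature", "mi:Image"),
    ("mi:ColumnMiniature", "mi:Miniature"),
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return every direct and implied superclass of cls."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)  # follow the chain upwards
    return found


print(sorted(superclasses("mi:ColumnMiniature")))
# prints ['mi:Image', 'mi:Miniature']
```

Only the two subclass statements are stored; the membership of mi:ColumnMiniature in mi:Image is derived by following the chain, which is what 'implicitly a subclass' means here.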

Defining properties

In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
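
How domain and range statements could be used to detect violations can be sketched in Python. This is a rough illustration under stated assumptions: triples are plain tuples, the instance names (mi:img42, mi:type1, mi:img43) are invented, and the domain mi:Image is assumed here so that domain and range differ. Note also that standard RDFS semantics treats rdfs:domain and rdfs:range as a basis for inferring types; the sketch follows the 'restriction' reading used in this primer and does not take subclasses into account.

```python
# Schema statements as (subject, property, value) tuples.
# The rdfs:range statement comes from the example; the rdfs:domain of
# mi:Image is an assumption made only for this sketch.
SCHEMA = [
    ("mi:hasType", "rdf:type", "rdf:Property"),
    ("mi:hasType", "rdfs:range", "mi:ColumnMiniature"),
    ("mi:hasType", "rdfs:domain", "mi:Image"),
]


def types_of(resource, triples):
    """Classes that a resource is explicitly declared to be an instance of."""
    return {o for s, p, o in triples if s == resource and p == "rdf:type"}


def violations(schema, data):
    """List statements whose subject or value breaks a domain/range statement."""
    problems = []
    for s, p, o in data:
        for prop, constraint, cls in schema:
            if prop != p:
                continue
            if constraint == "rdfs:domain" and cls not in types_of(s, data):
                problems.append((s, p, o, "domain", cls))
            if constraint == "rdfs:range" and cls not in types_of(o, data):
                problems.append((s, p, o, "range", cls))
    return problems


# Hypothetical instance data.
good = [
    ("mi:img42", "rdf:type", "mi:Image"),
    ("mi:type1", "rdf:type", "mi:ColumnMiniature"),
    ("mi:img42", "mi:hasType", "mi:type1"),
]
bad = [
    ("mi:img42", "rdf:type", "mi:Image"),
    ("mi:img43", "rdf:type", "mi:Image"),
    ("mi:img42", "mi:hasType", "mi:img43"),  # value is not a ColumnMiniature
]

print(violations(SCHEMA, good))
print(violations(SCHEMA, bad))
```

The first call reports no problems; the second reports the mi:hasType statement, because its value is not declared as an instance of mi:ColumnMiniature.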

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'13

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
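
The combination of harvested RDF instance descriptions into one repository, and view-based filtering over it, can be sketched in Python as follows. This is a simplified illustration and does not describe how Ontogator is actually implemented: triples are plain tuples, and all class and instance names (fms:Textile, fms:Rug, a:item1 and so on) are hypothetical.

```python
def merge(*cards):
    """Union of several sets of triples (ontology and harvested RDF cards)."""
    repository = set()
    for card in cards:
        repository |= set(card)
    return repository


def subclasses(cls, repository):
    """cls together with all its direct and implied subclasses."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in repository:
            if p == "rdfs:subClassOf" and o in result and s not in result:
                result.add(s)
                changed = True
    return result


def view(repository, *classes):
    """Instances that belong to every selected class or one of its subclasses."""
    matches = None
    for cls in classes:
        wanted = subclasses(cls, repository)
        instances = {s for s, p, o in repository
                     if p == "rdf:type" and o in wanted}
        matches = instances if matches is None else matches & instances
    return matches or set()


# Hypothetical shared ontology and two harvested museum descriptions.
ontology = {
    ("fms:Rug", "rdfs:subClassOf", "fms:Textile"),
    ("fms:Tapestry", "rdfs:subClassOf", "fms:Textile"),
}
museum_a = {
    ("a:item1", "rdf:type", "fms:Rug"),
    ("a:item1", "rdf:type", "fms:Woollen"),
}
museum_b = {
    ("b:item7", "rdf:type", "fms:Tapestry"),
    ("b:item9", "rdf:type", "fms:Woollen"),
}

repository = merge(ontology, museum_a, museum_b)
print(sorted(view(repository, "fms:Textile")))
print(sorted(view(repository, "fms:Textile", "fms:Woollen")))
```

Selecting fms:Textile alone returns items from both museums; adding the second class narrows the result to the single item that satisfies both restrictions, which mirrors the step-by-step constraining of views described above.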

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.15

Agent

'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent

'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action

'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.


14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources

In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

[Diagram: The Finnish Museums on the Semantic Web, overview of the system's set-up. Each museum has a collection database (1 to n; relational schemas, DBMS), an XML repository (XML Schema: syntactic interoperability) and RDF instances (RDF Schema: semantic interoperability), with a metadata editor. A Web crawler harvests the RDF instances into the RDF database on the server (semantic graph: knowledge space of shared ontology and metadata). The server runs the Ontogator software, which offers topic-based navigation and view-based filtering to the user client (WWW browser).]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia, middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the 'standard' HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, UK
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)', that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5, from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7, from Fol. 2r, size 45x50; illum.: 'Sub-Fauvel' Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. 'image', to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name='image' there is a required attribute image_id of the datatype xs:string (for example, image5kb78d38i).
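
Some of the rules listed above can be checked by hand with Python's standard xml.etree.ElementTree module. The following is only a minimal sketch and no substitute for a validating XML Schema processor: the function name check_image_record and the child element name 'title' are assumptions made here, and the order of child elements required by xs:sequence is not checked.

```python
import xml.etree.ElementTree as ET

# Namespace of the fictitious organisation used in the example.
NS = "http://www.m-i.org/images"


def check_image_record(xml_text):
    """Return a list of problems found in one image description."""
    problems = []
    root = ET.fromstring(xml_text)
    # The parent element must be a namespace-qualified <image>.
    if root.tag != f"{{{NS}}}image":
        problems.append("root element is not a namespace-qualified <image>")
    # The attribute image_id is required.
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    for child in root:
        # Child elements must be namespace qualified ...
        if not child.tag.startswith(f"{{{NS}}}"):
            problems.append(f"child {child.tag} is not namespace qualified")
        # ... and simple, i.e. they must not contain other elements.
        if len(child) > 0:
            problems.append(f"child {child.tag} is not a simple type")
    return problems


# Hypothetical documents; the child element 'title' is invented.
good = ('<image xmlns="http://www.m-i.org/images" image_id="image5kb78d38i">'
        '<title>Column miniature</title></image>')
bad = ('<image xmlns="http://www.m-i.org/images">'
       '<title>Column miniature</title></image>')

print(check_image_record(good))
print(check_image_record(bad))
```

The first document passes all checks; the second is reported because the required attribute image_id is missing.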

Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.

Ontological concepts

The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4

Ontology

In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.

5 For more elaborate and formal descriptions, see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge, there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421: Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
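The idea of a hierarchical path can be sketched in a few lines of Python. The sketch assumes that each additional character of a plain code such as 71A3421 narrows the concept by one level, which holds for this example; it does not handle other Iconclass notation features. The function name and the label table are illustrative, with labels taken from the path given for this code in the primer.

```python
# Labels for the levels of 71A3421, as given in this primer (first and last levels).
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(code):
    """Return every prefix of the code, from the broadest to the narrowest level."""
    return [code[:i] for i in range(1, len(code) + 1)]


path = hierarchical_path("71A3421")
for level in path:
    # Levels without a label in the small table above are printed without text.
    print(level, LABELS.get(level, ""))
```

Because the code itself encodes the path, a browser can offer broader or narrower concepts without any further lookup structure.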

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9

The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'10

Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties which describes, in a formal language, concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12


7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body

6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.

7 For in-depth information, see the official Iconclass Website: http://www.iconclass.nl

8 Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml

9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html

10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327

11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.

12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified sources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources;

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;

| statements: these associate a value for a named property with the resource.

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
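The triple model described above can be illustrated with a small Python sketch. It is not an RDF library, only a plain data structure; the class names URI, Literal and Triple, the instance URI of the image and the '#' separator in the example namespace are assumptions made for illustration.

```python
from typing import NamedTuple, Union

MI = "http://www.m-i.org/schemas/images#"   # example namespace; '#' separator assumed
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


class URI(str):
    """A resource identified by a URI; usable as node or as arc label."""


class Literal(str):
    """A plain value, drawn as a box in the graph."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


graph = [
    Triple(URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf"), URI(MI + "Miniature")),
    Triple(URI(MI + "Miniature"), URI(RDFS + "subClassOf"), URI(MI + "Image")),
    # Hypothetical instance URI with a literal as object.
    Triple(URI("http://www.m-i.org/images/5kb78d38i"), URI(MI + "hasCreator"),
           Literal("Alexander Master")),
    # hasCreator is used above as an arc label and here as a node.
    Triple(URI(MI + "hasCreator"), URI(RDF + "type"), URI(RDF + "Property")),
]

nodes = {t.subject for t in graph} | {t.obj for t in graph if isinstance(t.obj, URI)}
arc_labels = {t.predicate for t in graph}
literals = [t.obj for t in graph if isinstance(t.obj, Literal)]
both = nodes & arc_labels   # URIs that occur as node and as arc label
```

The last line shows the point made in the text: the same URI may serve as a node in one statement and as an arc label in another.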

Graphic 2: RDF Data Model

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature, and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, 'Alexander Master'>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
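How a mapping rule is instantiated can be sketched as follows. This is not the FMS editor tool; the XML record, its element names and the function apply_rule are hypothetical, and Python's ElementTree supports only a subset of XPath, which is sufficient for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; the real document structure is not reproduced here.
XML_DOC = """<image id="5kb78d38i">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>"""

# A rule is a triple template: (XPath for subject, fixed predicate, XPath for object).
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(root, rule):
    """Instantiate the template; return no triples if an XPath finds nothing."""
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []   # the rule does not match this document
    return [(subject.strip(), predicate, obj.strip())]


root = ET.fromstring(XML_DOC)
triples = apply_rule(root, RULE)
no_match = apply_rule(root, ("./type", "mi:hasSubject", "./subject"))
```

A rule whose XPath expressions find no element simply produces no triples, so one rule set can be applied to records of varying completeness.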

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
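The difference between the two cases can be made concrete with a short sketch. The class names, the terms and both functions are invented for illustration and do not come from the FMS ontology; the sketch only shows why synonyms can be resolved automatically while polysemous terms cannot.

```python
# Hypothetical ontology classes with attached synonym sets.
SYNONYM_SETS = {
    "fms:Rug": {"rug", "carpet"},
    "fms:Shawl": {"shawl", "wrap"},
    "fms:GiftWrap": {"wrap", "wrapping"},
}


def candidate_classes(term):
    """Return all classes whose synonym set contains the term."""
    needle = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if needle in terms)


def map_term(term):
    """Map a term automatically only if exactly one class matches."""
    candidates = candidate_classes(term)
    if len(candidates) == 1:
        return candidates[0]
    return None   # unknown or polysemous term: the cataloguer has to decide


mapped = map_term("Carpet")              # synonym of 'rug': one class
ambiguous = candidate_classes("wrap")    # polysemous: two classes
```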

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images#

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
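The implicit subclass statement can be derived mechanically. The following sketch computes the transitive closure of the two explicit rdfs:subClassOf statements; the function name is illustrative, and prefixed names are used as plain strings instead of full URIs.

```python
SUBCLASS = "rdfs:subClassOf"

# The two explicit statements from the text.
triples = {
    ("mi:Miniature", SUBCLASS, "mi:Image"),
    ("mi:ColumnMiniature", SUBCLASS, "mi:Miniature"),
}


def subclass_closure(statements):
    """Return all (subclass, superclass) pairs, explicit and implied."""
    closure = {(s, o) for s, p, o in statements if p == SUBCLASS}
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a < b and b < d implies a < d
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


closure = subclass_closure(triples)
implied = closure - {(s, o) for s, p, o in triples}
```

Here `implied` contains exactly the one statement that was never written down: mi:ColumnMiniature is a subclass of mi:Image.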

Graphic 2 on page 32 visualises this with the nodesand arcs of the basic RDF data model

Defining properties

In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images#) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
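A minimal sketch of checking statements against domain and range constraints is given below. It treats rdfs:domain and rdfs:range as restrictions to be validated, as this primer describes them. The property mi:depicts, the class mi:Subject and the instance names are hypothetical, and the function names are illustrative.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

schema = {
    ("mi:Miniature", SUBCLASS, "mi:Image"),
    ("mi:ColumnMiniature", SUBCLASS, "mi:Miniature"),
    ("mi:depicts", "rdfs:domain", "mi:Image"),    # hypothetical property
    ("mi:depicts", "rdfs:range", "mi:Subject"),   # hypothetical class
}

data = {
    ("mi:img1", RDF_TYPE, "mi:ColumnMiniature"),
    ("mi:eve", RDF_TYPE, "mi:Subject"),
    ("mi:img1", "mi:depicts", "mi:eve"),   # respects domain and range
    ("mi:eve", "mi:depicts", "mi:img1"),   # violates both constraints
}


def superclasses(cls):
    """Return cls plus every class reachable via rdfs:subClassOf."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(o for s, p, o in schema if s == current and p == SUBCLASS)
    return seen


def types_of(resource):
    """Return the declared classes of a resource and all their superclasses."""
    result = set()
    for s, p, o in data:
        if s == resource and p == RDF_TYPE:
            result |= superclasses(o)
    return result


def violations():
    """List (subject, property, object, constraint) for every broken constraint."""
    problems = []
    for s, p, o in data:
        for prop, constraint, cls in schema:
            if prop != p:
                continue
            if constraint == "rdfs:domain" and cls not in types_of(s):
                problems.append((s, p, o, "domain"))
            if constraint == "rdfs:range" and cls not in types_of(o):
                problems.append((s, p, o, "range"))
    return sorted(problems)


problems = violations()
```

Note that mi:img1 satisfies the domain mi:Image only through the subclass hierarchy, since it is declared as an mi:ColumnMiniature.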

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:

| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.'13

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
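The combination of harvested RDF cards and view-based filtering can be sketched as follows. This is not the Ontogator software; the class mi:DecoratedInitial, the museum prefixes and object names, and the function names are assumptions, and the example reuses the medieval images classes of this primer instead of the textiles ontology.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

ontology = {
    ("mi:Miniature", SUBCLASS, "mi:Image"),
    ("mi:ColumnMiniature", SUBCLASS, "mi:Miniature"),
    ("mi:DecoratedInitial", SUBCLASS, "mi:Image"),   # hypothetical class
}

# RDF cards as harvested from two hypothetical museums.
card_a = {
    ("museumA:obj1", RDF_TYPE, "mi:ColumnMiniature"),
    ("museumA:obj2", RDF_TYPE, "mi:DecoratedInitial"),
}
card_b = {("museumB:obj7", RDF_TYPE, "mi:Miniature")}

# The repository is the union of the shared ontology and all cards.
repository = ontology | card_a | card_b


def with_subclasses(cls, graph):
    """Return cls together with all its direct and indirect subclasses."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current not in found:
            found.add(current)
            todo.extend(s for s, p, o in graph if p == SUBCLASS and o == current)
    return found


def view_filter(selected_classes, graph):
    """Return the instances that satisfy every selected class restriction."""
    matches = None
    for cls in selected_classes:
        accepted = with_subclasses(cls, graph)
        hits = {s for s, p, o in graph if p == RDF_TYPE and o in accepted}
        matches = hits if matches is None else matches & hits
    return sorted(matches or [])


all_images = view_filter(["mi:Image"], repository)
miniatures = view_filter(["mi:Miniature"], repository)
narrowed = view_filter(["mi:Miniature", "mi:ColumnMiniature"], repository)
```

Each further class selection intersects the previous result, which mirrors the stepwise narrowing of views described above.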

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.15

Agent

'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent

'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action

'By flexible autonomous action we mean reactive, proactive, social.'

| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).'

| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'

| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language and perhaps cooperate with others.'

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

| Learning: an agent will improve its performance over time.


14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002; and http://www.csc.liv.ac.uk/~mjw/pubs/imas

Resources

In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

The Finnish Museums on the Semantic Web: Overview of the System's Set-up
User client: WWW browser (topic-based navigation, view-based filtering)
Server: Ontogator software; RDF database (semantic graph: knowledge space of shared ontology and metadata)
Web crawler
RDF instances (RDF Schema: semantic interoperability)
Metadata Editor; XML repository (XML Schema: syntactic interoperability)
Collection databases 1, 2, ... n (relational schemas, DBMS)

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995, he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998, he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in a doctoral research in 'Urban and rural history'; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time, he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems. ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992, she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University. Since October 1999, Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meliedwit

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: PMillerukolnacuk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: FrankNackcwinl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference. See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: SRosshatiiartsglaacuk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies. See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottiangalileoimssfirenzeit


DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntramgesersalzburgresearchat
Phone: +43-(0)662-2288-303

Mr John Pereira, johnpereirasalzburgresearchat
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, srosshatiiartsglaacuk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DIGICULT PROJECT INFORMATION

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th 2002 in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd 2002 in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r; size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r; size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v; size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra; size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1; size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v; size 160x125

Breviary, Cambray(), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r; size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421: Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
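The hierarchical path for a notation such as 71A3421 can be sketched in a few lines of Python. This is only an illustration, under the assumption that every prefix of a plain notation names a broader concept; bracketed text and other extended Iconclass notation are not handled, and the function name is hypothetical.

```python
def iconclass_path(notation):
    """Return the chain of broader notations for a plain Iconclass code.

    Assumption: the code is purely hierarchical, so each prefix of the
    notation is a broader concept. Bracketed additions are not supported.
    """
    return [notation[:length] for length in range(1, len(notation) + 1)]


# Walk from the top-level division down to the concept itself.
for code in iconclass_path("71A3421"):
    print(code)
```

For 71A3421 this lists 7, 71, 71A, 71A3, 71A34, 71A342 and 71A3421, one notation per level of the classification.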

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’[10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]


Iconclass path for 71A3421:
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body

[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html


Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources;

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;

| statements: these associate a value for a named property with the resource.

Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
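As a rough illustration of the triple model, the two statements shown in graphic 2 can be held as plain Python tuples. This is a sketch, not an RDF library: the `Triple` type and the `objects` helper are hypothetical names, and the `#` separator in the example namespace is an assumption.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# Assumption: the example namespace ends in '#', as the RDFS namespace does.
MI = "http://www.m-i.org/schemas/images#"


class Triple(NamedTuple):
    subject: str    # always a URI
    predicate: str  # always a URI
    obj: str        # a URI or a literal


# A graph is simply a set of triples.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow every arc labelled `predicate` that leaves the node `subject`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


print(objects(graph, MI + "Miniature", RDFS + "subClassOf"))
```

Note that the URI of Miniature appears once as an object and once as a subject, which is what lets separate statements join up into one graph.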

Graphic 2: RDF Data Model (the prefixes rdfs: and mi: abbreviate the two namespaces named below)

Subject: mi:ColumnMiniature | Predicate: rdfs:subClassOf | Object: mi:Miniature
Subject: mi:Miniature | Predicate: rdfs:subClassOf | Object: mi:Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.

Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples in which XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML processing software (XSLT, XPointer and others).

When such a rule is applied to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples in which the XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
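A mapping rule of this kind can be sketched with the limited XPath support in Python's standard library. The XML record, its element names and the rule notation below are hypothetical stand-ins, not the FMS formats; the point is only how XPath slots in a triple template are filled with element values.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML metadata record; the element names are assumptions.
XML_RECORD = """
<image id="kb78d38i">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""

# A rule is a triple template. Parts starting with "./" are XPath
# expressions evaluated against the record; other parts are copied as-is.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(rule, xml_text):
    """Instantiate a triple template, or return None if the rule does not match."""
    root = ET.fromstring(xml_text.strip())
    triple = []
    for part in rule:
        if part.startswith("./"):
            value = root.findtext(part)
            if value is None:
                return None  # an XPath expression found nothing
            triple.append(value.strip())
        else:
            triple.append(part)
    return tuple(triple)


print(apply_rule(RULE, XML_RECORD))
```

A rule whose XPath expression finds no element yields no triple at all, which mirrors the "if the rule matches" condition in the text.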

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
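The idea of synonym sets attached to ontology classes can be illustrated as follows. This is a minimal sketch with invented class names and terms; it is not the FMS implementation, whose details are not described here.

```python
# Hypothetical synonym sets attached to hypothetical ontology classes.
SYNONYM_SETS = {
    "mi:ColumnMiniature": {"column miniature", "column picture"},
    "mi:DecoratedInitial": {"decorated initial", "initial"},
    "mi:FirstDraft": {"initial"},  # 'initial' is deliberately polysemous
}


def classes_for(term):
    """Return every class whose synonym set contains the term."""
    term = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if term in terms)


def resolve(term):
    """Map a term to one class, or report that a human must decide."""
    candidates = classes_for(term)
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown or polysemous: the cataloguer has to choose


print(resolve("Column picture"))
print(classes_for("initial"))
```

A synonym resolves to exactly one class, while a polysemous term returns several candidates and is left for the cataloguer.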

Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
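The implicit subclass relationship follows from walking the rdfs:subClassOf arcs upwards. A minimal sketch, using prefixed names as plain strings and a hypothetical `superclasses` helper rather than an RDF toolkit:

```python
# Explicitly stated rdfs:subClassOf arcs: class -> set of direct superclasses.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return all direct and implied superclasses of a class."""
    found = set()
    todo = [cls]
    while todo:
        for parent in subclass_of.get(todo.pop(), ()):
            if parent not in found:  # also guards against cycles
                found.add(parent)
                todo.append(parent)
    return found


print(sorted(superclasses("mi:ColumnMiniature")))
```

Only two arcs are stated, yet mi:Image is reported as a superclass of mi:ColumnMiniature, which is the transitivity described above.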

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
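How a validator might use domain and range declarations to detect violations can be sketched as follows. The declarations for mi:hasCreator, the class mi:Illuminator and the instance names are all hypothetical, and domain and range are treated strictly as constraints, as this primer does.

```python
# Hypothetical schema: class hierarchy plus domain and range declarations.
SUBCLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}
DOMAIN = {"mi:hasCreator": "mi:Image"}
RANGE = {"mi:hasCreator": "mi:Illuminator"}

# Hypothetical instance data: resource -> its declared rdf:type.
TYPE_OF = {
    "mi:img1": "mi:ColumnMiniature",
    "mi:alexanderMaster": "mi:Illuminator",
    "mi:library1": "mi:Library",
}


def is_a(resource, cls):
    """True if the resource's type is cls or one of its subclasses."""
    current = TYPE_OF.get(resource)
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def violations(subject, prop, obj):
    """List the domain and range constraints that a statement breaks."""
    problems = []
    if prop in DOMAIN and not is_a(subject, DOMAIN[prop]):
        problems.append(f"{subject} is outside the domain {DOMAIN[prop]}")
    if prop in RANGE and not is_a(obj, RANGE[prop]):
        problems.append(f"{obj} is outside the range {RANGE[prop]}")
    return problems


print(violations("mi:img1", "mi:hasCreator", "mi:alexanderMaster"))
print(violations("mi:library1", "mi:hasCreator", "mi:alexanderMaster"))
```

The first statement passes because a column miniature is, through the subclass chain, an image; the second is rejected because a library is not.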

Benefits of RDF
In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:

| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.’[13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
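The two steps just described, combining harvested RDF cards into one repository and then filtering instances by a selected class, can be sketched as below. This is an illustration only and says nothing about how Ontogator is implemented; the class names, instance identifiers and function names are hypothetical.

```python
# Shared ontology and two harvested RDF cards, as sets of triples.
ONTOLOGY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}
CARD_A = {("museumA:obj1", "rdf:type", "mi:ColumnMiniature")}
CARD_B = {
    ("museumB:obj7", "rdf:type", "mi:Miniature"),
    ("museumB:obj9", "rdf:type", "mi:Image"),
}


def build_repository(ontology, cards):
    """The repository is the union of the ontology and all RDF cards."""
    return set(ontology).union(*cards)


def subclasses(repo, cls):
    """The class itself plus everything below it in the hierarchy."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in repo:
            if p == "rdfs:subClassOf" and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def instances_of(repo, cls):
    """View-based filter: instances typed with cls or any subclass of it."""
    wanted = subclasses(repo, cls)
    return {s for s, p, o in repo if p == "rdf:type" and o in wanted}


repo = build_repository(ONTOLOGY, [CARD_A, CARD_B])
print(sorted(instances_of(repo, "mi:Image")))
print(sorted(instances_of(repo, "mi:Miniature")))
```

Selecting the narrower class mi:Miniature returns fewer instances than mi:Image, which corresponds to constraining a view further.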

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from MichaelWooldridgersquos introduction to multiagent systems15

Agent

lsquoAn agent is a computer system capable of autono-mous action in some environmentrsquo

Intelligent agent

lsquoAn intelligent agent is a computer system capableof flexible autonomous action in some environmentrsquo

Flexible autonomous action

‘By flexible autonomous action, we mean reactive, proactive, social.’

| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’

| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

| Learning: an agent will improve its performance over time.
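The three kinds of flexibility can be illustrated with a toy agent. This is only a sketch under stated assumptions: the class, its methods and the message format are hypothetical and are not taken from Wooldridge or from any agent toolkit.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Toy agent; every name here is hypothetical and for illustration only."""
    name: str
    goal: str                                   # topic the agent is looking for
    found: list = field(default_factory=list)   # record ids matching the goal
    inbox: list = field(default_factory=list)   # messages from other agents

    def react(self, event: dict) -> None:
        # Reactivity: respond to a change in the environment (a new record).
        if self.goal in event.get("subjects", []):
            self.found.append(event["id"])

    def act(self, peers: list) -> None:
        # Proactiveness: while the goal is unmet, take the initiative.
        # Social ability: ask other agents, using a very simple message format.
        if not self.found:
            for peer in peers:
                peer.inbox.append({"from": self.name, "ask": self.goal})

    def answer(self, known: dict, agents: dict) -> None:
        # Social ability: reply to queued requests with matching record ids.
        while self.inbox:
            msg = self.inbox.pop(0)
            for record_id in known.get(msg["ask"], []):
                agents[msg["from"]].react({"id": record_id, "subjects": [msg["ask"]]})


a = Agent("a", goal="textiles")
b = Agent("b", goal="miniatures")
agents = {"a": a, "b": b}

a.react({"id": "rec-1", "subjects": ["ceramics"]})  # irrelevant event, ignored
a.act([b])                                          # goal unmet, so a asks b
b.answer({"textiles": ["rec-7"]}, agents)           # b replies with what it knows
```

Mobility, rationality and learning are deliberately left out; they are far harder to capture in a few lines.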


[14] T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

[15] M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3, No. 1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


The Finnish Museums on the Semantic Web: Overview of the System’s Set-up

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

[Diagram. Collection databases 1 to n (relational schemas, DBMS) each feed an XML repository (XML Schema: syntactic interoperability). A metadata editor turns the XML into RDF instances (RDF Schema: semantic interoperability). A Web crawler collects the RDF instances into the RDF database, a semantic graph forming the knowledge space of shared ontology and metadata. The server runs the Ontogator software; the user client is a WWW browser offering topic-based navigation and view-based filtering.]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems. ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
| SPECTRUM-XML: a project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University. Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, UK
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 – Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 – Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria.
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties, which are defined by the RDF Schema of the FMS initiative (see section 3).

‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml

RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, an electronic document or a digital image; but in RDF, entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can also be resources;

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;

| statements: these associate a value for a named property with the resource.

Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object, and an arrow between them for the predicate (see graphic 2). In this labelled, directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label.

To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
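The triple model is small enough to sketch with plain tuples. The following is a minimal illustration, not an RDF library: the helper functions are hypothetical, and the ‘#’ separator in the m-i.org namespace is an assumption, since the primer only names the namespace itself. The two triples are the subclass statements used as the running example in this primer.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"   # assumed '#' separator

# Each statement is a (subject, predicate, object) tuple of URIs.
triples = [
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
]


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by an arc labelled `predicate`."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def nodes(graph):
    """Subjects and objects are the nodes of the graph; predicates label the arcs."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}


parents = objects(triples, MI + "ColumnMiniature", RDFS + "subClassOf")
```

Because a predicate is itself a URI, nothing in this representation prevents the same URI from appearing as a node in one triple and as an arc label in another.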

Graphic 2: RDF Data Model

[Diagram: two triples, each drawn as subject node, predicate arc and object node.
Triple 1: subject http://www.m-i.org/schemas/images#ColumnMiniature, predicate http://www.w3.org/2000/01/rdf-schema#subClassOf, object http://www.m-i.org/schemas/images#Miniature.
Triple 2: subject http://www.m-i.org/schemas/images#Miniature, predicate http://www.w3.org/2000/01/rdf-schema#subClassOf, object http://www.m-i.org/schemas/images#Image.]

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.

Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
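How such a rule is instantiated can be sketched with the XPath subset in Python's standard library. This is an illustration under stated assumptions, not the FMS rule syntax or the Meedio tool: the XML record, its element names and the rule format below are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; the element names and the id are assumptions.
XML_DOC = """
<collection>
  <image id="5kb78d38i">
    <type>Miniature</type>
    <creator>Alexander Master</creator>
  </image>
</collection>
"""

# A rule is a triple template whose subject and object are XPath expressions.
RULE = (
    "./image[@id='5kb78d38i']/type",
    "mi:hasCreator",
    "./image[@id='5kb78d38i']/creator",
)


def apply_rule(root, rule):
    """Instantiate the template; return [] if an XPath expression has no match."""
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []
    return [(subject, predicate, obj)]


root = ET.fromstring(XML_DOC)
result = apply_rule(root, RULE)
```

A rule whose XPath expressions find nothing simply produces no triples, which mirrors the ‘if the rule matches’ condition in the text.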

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same term refers to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
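The difference between the two cases can be shown in a few lines. This is a hedged sketch only: the class names, the terms and both functions are invented for illustration and do not come from the FMS ontology.

```python
# Hypothetical synonym sets attached to ontology classes.
SYNONYMS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:Dish": {"plate", "dish"},
    "mi:PrintingPlate": {"plate", "printing plate"},
}


def candidate_classes(term):
    """Return all ontology classes whose synonym set contains the term."""
    term = term.strip().lower()
    return sorted(cls for cls, names in SYNONYMS.items() if term in names)


def resolve(term):
    """Map a term to one class, or None if the cataloguer has to decide."""
    candidates = candidate_classes(term)
    return candidates[0] if len(candidates) == 1 else None
```

Here `resolve("Illumination")` finds a single class through its synonym set, while `resolve("plate")` returns `None` because two classes claim the term, which is the polysemy case left to the cataloguer.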

Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema

In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.

Graphic 2 on page 32 visualises this with the nodesand arcs of the basic RDF data model
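Transitivity means that the implicit superclasses can be computed from the stated ones. A minimal sketch, assuming the class hierarchy is held in a plain dictionary (the data structure and function name are illustrative choices, not part of RDF Schema):

```python
# Directly stated rdfs:subClassOf relations from the example above.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, direct=SUBCLASS_OF):
    """All classes reachable from `cls` by following rdfs:subClassOf arcs."""
    seen, stack = set(), list(direct.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:        # the check also guards against cycles
            seen.add(current)
            stack.extend(direct.get(current, ()))
    return seen


implied = superclasses("mi:ColumnMiniature")
```

Only two relations are stated, yet `implied` contains both mi:Miniature and mi:Image.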

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
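Read as constraints, in the way this primer describes them, domain and range declarations allow a simple check of instance statements. The sketch below assumes each resource has exactly one declared class and ignores subclass relations; the resource identifiers, the mi:Museum class and the function are hypothetical.

```python
def violations(triples, types, domain, rng):
    """List statements whose subject or object breaks a domain/range constraint.

    `types` maps a resource to its single declared class (a simplification);
    `domain` and `rng` map a property to the class it requires.
    """
    problems = []
    for s, p, o in triples:
        if p in domain and types.get(s) != domain[p]:
            problems.append((s, p, o, "domain"))
        if p in rng and types.get(o) != rng[p]:
            problems.append((s, p, o, "range"))
    return problems


# Hypothetical instance data.
types = {"ex:img1": "mi:ColumnMiniature", "ex:museum": "mi:Museum"}
domain = {"mi:hasType": "mi:ColumnMiniature"}
statements = [
    ("ex:img1", "mi:hasType", "ex:t1"),      # subject is a column miniature
    ("ex:museum", "mi:hasType", "ex:t1"),    # subject is of the wrong class
]

found = violations(statements, types, domain, rng={})
```

A fuller checker would accept instances of subclasses as well, using the transitive subclass relation.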

Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:

| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.’ [13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
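Combining harvested RDF cards into one repository amounts to taking the union of their statements together with the shared ontology. The following sketch assumes each card is available as a set of statement tuples; the fms:, museum1: and museum2: names and the function build_repository are hypothetical and do not describe the actual FMS software.

```python
# Shared ontology and two harvested RDF cards (all names are illustrative).
ONTOLOGY = {("fms:Carpet", "rdfs:subClassOf", "fms:Textile")}
CARD_MUSEUM_1 = {("museum1:item7", "rdf:type", "fms:Carpet")}
CARD_MUSEUM_2 = {
    ("museum2:item3", "rdf:type", "fms:Textile"),
    ("museum2:item3", "fms:hasCreator", "Unknown weaver"),
}


def build_repository(ontology, cards):
    """Merge the ontology and all harvested cards into one set of statements."""
    repository = set(ontology)
    for card in cards:
        repository |= card  # set union: identical statements are stored once
    return repository


repository = build_repository(ONTOLOGY, [CARD_MUSEUM_1, CARD_MUSEUM_2])
print(len(repository))
```

Because the cards share the vocabulary of the ontology, statements from different museums end up in one connected graph without any further conversion.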

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
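The idea of view-based filtering can be sketched in a few lines. This is not how Ontogator is implemented; it is a simplified illustration that assumes the repository is a set of statement tuples and that an instance matches a selected class if it is typed with that class or one of its subclasses. All fms: and museum names are hypothetical.

```python
RDF_TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"

REPOSITORY = {
    ("fms:Carpet", SUBCLASS, "fms:Textile"),
    ("fms:Ryijy", SUBCLASS, "fms:Carpet"),
    ("fms:Costume", SUBCLASS, "fms:Textile"),
    ("museum1:item7", RDF_TYPE, "fms:Ryijy"),
    ("museum2:item9", RDF_TYPE, "fms:Carpet"),
    ("museum2:item3", RDF_TYPE, "fms:Costume"),
}


def subclasses_of(cls, triples):
    """Return cls itself plus every class below it in the subclass hierarchy."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for subject, prop, value in triples:
            if prop == SUBCLASS and value in result and subject not in result:
                result.add(subject)
                changed = True
    return result


def filter_view(triples, selected_classes):
    """Return the instances that satisfy every selected class restriction."""
    matches = None
    for cls in selected_classes:
        accepted = subclasses_of(cls, triples)
        hits = {s for s, p, o in triples if p == RDF_TYPE and o in accepted}
        matches = hits if matches is None else matches & hits
    return sorted(matches or [])


print(filter_view(REPOSITORY, {"fms:Textile"}))
print(filter_view(REPOSITORY, {"fms:Carpet"}))
```

Selecting the broad class fms:Textile returns all three items, while narrowing the view to fms:Carpet leaves only the two carpets, which corresponds to constraining the view further.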

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems. [15]

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’

| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’

| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

| Learning: an agent will improve its performance over time.

[14] T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

[15] M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/

K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf

[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Overview of the system’s set-up: the user client (WWW browser with topic-based navigation and view-based filtering); the server (Ontogator software and RDF database holding the semantic graph, i.e. the knowledge space of shared ontology and metadata); the Web crawler; and, for each museum, a collection database (relational schemas, DBMS), an XML repository (XML Schema, syntactic interoperability), the metadata editor and the RDF instances (RDF Schema, semantic interoperability).]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems. ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University. Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, UK
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003

IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples, where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule

<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>

to the XML document described in the section on XML, the following result would be obtained:

<Image Miniature, mi:hasCreator, ‘Alexander Master’>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
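The way a mapping rule is instantiated can be sketched with Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath. This is not the Meedio tool: the XML document, its element names (image, type, creator), the path expressions and the function apply_rule are all assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML metadata record; the element names are assumed.
XML_DOC = """<image>
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>"""

# A rule is a triple template: path for the subject, fixed property, path for the value.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(rule, xml_text):
    """Instantiate the template with matching element values, or return no triples."""
    root = ET.fromstring(xml_text)
    subject_path, prop, value_path = rule
    subject = root.findtext(subject_path)
    value = root.findtext(value_path)
    if subject is None or value is None:
        return []  # the rule does not match this document
    return [(subject, prop, value)]


print(apply_rule(RULE, XML_DOC))
```

If one of the paths finds no element, the rule simply yields no triples, which corresponds to a rule that does not match the document.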

Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
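The difference between synonymous and polysemous terms can be made concrete with a small sketch. It assumes that each ontology class carries a set of accepted terms; the class names, the terms and the functions candidate_classes and map_term are invented here and do not reproduce the FMS ontology or its editor tool.

```python
# Hypothetical synonym sets attached to ontology classes.
SYNONYM_SETS = {
    "fms:Carpet": {"carpet", "rug"},
    "fms:Shawl": {"shawl", "wrap"},
    "fms:Packaging": {"wrap", "wrapping"},
}


def candidate_classes(term, synonym_sets):
    """Return all classes whose synonym set contains the term."""
    key = term.strip().lower()
    return sorted(cls for cls, terms in synonym_sets.items() if key in terms)


def map_term(term, synonym_sets):
    """Map a term automatically only when exactly one class fits."""
    candidates = candidate_classes(term, synonym_sets)
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown or polysemous: the cataloguer has to decide


print(map_term("Rug", SYNONYM_SETS))
print(candidate_classes("wrap", SYNONYM_SETS))
```

The synonym ‘rug’ resolves to a single class, whereas ‘wrap’ yields two candidates, so map_term returns nothing and the choice is left to the cataloguer.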

Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.

Defining classes

With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:

mi:Image [resource] rdf:type [property] rdfs:Class [value]

The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
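What it means for a prefix to stand for a namespace can be shown by expanding prefixed names into full URI references. In this sketch the rdf: and rdfs: namespace URIs are the ones defined by the W3C, while the ‘#’ separator appended to the mi: namespace and the function name expand are assumptions made for the illustration.

```python
NAMESPACES = {
    # '#' separator for the mi: namespace is an assumption of this sketch.
    "mi": "http://www.m-i.org/schemas/images#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(qname):
    """Replace the prefix of a name such as 'mi:Image' by its namespace URI."""
    prefix, local_name = qname.split(":", 1)
    return NAMESPACES[prefix] + local_name


statement = ("mi:Image", "rdf:type", "rdfs:Class")
print(tuple(expand(name) for name in statement))
```

Written out in this way, the short statement mi:Image rdf:type rdfs:Class becomes three globally unique URI references, which is what allows schemas from different institutions to be combined without name clashes.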

In our image collection we have various specialkinds of digitised images such as column miniaturesdecorated initials schematic drawings etcTo distin-guish for example the miniatures first we need todefine a general class Miniature and subclasses ofminiatures eg a subclass ColumnMiniaturemiMiniature rdftype rdfsClass miColumnMiniature rdftype rdfsClass

Secondly we need to define thatmiColumnMiniature is a subclass of miMiniatureand that miMiniature is a subclass of miImagefor which we use the predefined rdfssubClassOfpropertymiMiniatures rdfssubClassOf miImage miColumnMiniature rdfssubClassOf miMiniature

As the rdfssubClassOf property is transitive thismeans that miColumnMiniature is also implicitly asubclass of miImage

Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:

    mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:

    mi:ColumnMiniature rdf:type rdfs:Class
    mi:hasType rdf:type rdf:Property
    mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:

    mi:ColumnMiniature rdf:type rdfs:Class
    mi:hasType rdf:type rdf:Property
    mi:hasType rdfs:domain mi:ColumnMiniature
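How such domain and range declarations could be used to detect violations can be sketched in a few lines of Python. This is only an illustration of the constraint reading given above: statements are assumed to be plain (subject, predicate, object) tuples, the instance identifiers ex:image1, ex:image2 and ex:text1 are hypothetical, the helper names are our own, and only directly stated rdf:type values are considered (no subclass reasoning).

```python
SCHEMA = {
    ("mi:ColumnMiniature", "rdf:type", "rdfs:Class"),
    ("mi:hasType", "rdf:type", "rdf:Property"),
    ("mi:hasType", "rdfs:domain", "mi:ColumnMiniature"),
    ("mi:hasType", "rdfs:range", "mi:ColumnMiniature"),
}

# Hypothetical instance data: two resources typed as column miniatures.
# ex:text1, used in the example below, is deliberately left untyped.
DATA = {
    ("ex:image1", "rdf:type", "mi:ColumnMiniature"),
    ("ex:image2", "rdf:type", "mi:ColumnMiniature"),
}


def classes_of(resource, triples):
    """Classes directly stated for a resource via rdf:type."""
    return {o for s, p, o in triples if s == resource and p == "rdf:type"}


def violations(statement, schema, data):
    """List the domain and range declarations that a statement does not satisfy."""
    subject, prop, value = statement
    problems = []
    for s, p, required in schema:
        if s != prop:
            continue
        if p == "rdfs:domain" and required not in classes_of(subject, data):
            problems.append(f"{subject} is not an instance of {required} (domain)")
        if p == "rdfs:range" and required not in classes_of(value, data):
            problems.append(f"{value} is not an instance of {required} (range)")
    return problems


# Satisfies both declarations.
print(violations(("ex:image1", "mi:hasType", "ex:image2"), SCHEMA, DATA))
# The value is not a column miniature, so the range declaration is violated.
print(violations(("ex:image1", "mi:hasType", "ex:text1"), SCHEMA, DATA))
```

The first call reports no problems; the second reports that ex:text1 does not satisfy the range declaration for mi:hasType.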

Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:

| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.

| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.

| The standard syntax and query capability will allow applications to exchange information more easily.

| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.

| Intelligent software agents will have more precise data to work with.’13

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
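The idea of combining harvested instance descriptions into one repository can be sketched as a simple union of statement sets. The following Python fragment is an illustration under stated assumptions, not a description of the FMS implementation: the museum names, the ex: identifiers and the harvest function are hypothetical, and a dictionary stands in for the public directories that a real crawler would fetch over the Web.

```python
# Hypothetical shared ontology, as (subject, predicate, object) triples.
ONTOLOGY = {
    ("ex:Painting", "rdfs:subClassOf", "ex:Artefact"),
}

# Hypothetical stand-in for what each museum has placed in its public directory.
PUBLISHED = {
    "museum-a": {
        ("ex:a/1", "rdf:type", "ex:Painting"),
    },
    "museum-b": {
        ("ex:b/7", "rdf:type", "ex:Artefact"),
        ("ex:b/7", "ex:title", "Kantele"),
    },
}


def harvest(published, ontology):
    """Combine the shared ontology and all published statements into one graph."""
    repository = set(ontology)
    for triples in published.values():
        repository |= triples  # set union; identical statements are stored once
    return repository


repository = harvest(PUBLISHED, ONTOLOGY)
print(len(repository))
```

Because statements are only copied from what each museum chose to publish, the sketch also reflects the point made above that the internal database systems are never accessed.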

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
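View-based filtering as described here can be approximated with a short Python sketch. It is not the Ontogator algorithm, whose details are not given in this primer; it merely assumes that the repository is a set of (subject, predicate, object) triples, that an instance matches a class if it is typed with that class or one of its subclasses, and that selecting several classes narrows the result to instances matching all of them. All ex: identifiers and function names are hypothetical.

```python
REPOSITORY = {
    ("ex:Painting", "rdfs:subClassOf", "ex:Artefact"),
    ("ex:Textile", "rdfs:subClassOf", "ex:Artefact"),
    ("ex:item1", "rdf:type", "ex:Painting"),
    ("ex:item1", "rdf:type", "ex:NineteenthCentury"),
    ("ex:item2", "rdf:type", "ex:Textile"),
    ("ex:item2", "rdf:type", "ex:NineteenthCentury"),
    ("ex:item3", "rdf:type", "ex:Painting"),
    ("ex:item3", "rdf:type", "ex:TwentiethCentury"),
}


def subclasses(cls, triples):
    """cls itself plus every class declared below it via rdfs:subClassOf."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == "rdfs:subClassOf" and o in result and s not in result:
                result.add(s)
                changed = True
    return result


def instances_of(cls, triples):
    """Resources typed with cls or with any of its subclasses."""
    wanted = subclasses(cls, triples)
    return {s for s, p, o in triples if p == "rdf:type" and o in wanted}


def filter_by_views(selected_classes, triples):
    """Instances that match every selected class; each further class narrows the set."""
    matches = None
    for cls in selected_classes:
        found = instances_of(cls, triples)
        matches = found if matches is None else matches & found
    return matches if matches is not None else set()


print(sorted(filter_by_views(["ex:Artefact"], REPOSITORY)))
print(sorted(filter_by_views(["ex:Painting", "ex:NineteenthCentury"], REPOSITORY)))
```

Selecting the general class ex:Artefact returns all three items, while adding a second view reduces the result to the single nineteenth-century painting.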

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14

This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’

| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’

| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

| Learning: an agent will improve its performance over time.

14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org.

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/

K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19.

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf

The Finnish Museums on the Semantic Web: Overview of the System’s Set-up

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (diagram). Components shown: user client (WWW browser; topic-based navigation, view-based filtering); server (Ontogator software; RDF database holding the semantic graph, i.e. the knowledge space of shared ontology and metadata); Web crawler; RDF instances (RDF Schema: semantic interoperability); XML repositories and metadata editor (XML Schema: syntactic interoperability); collection databases 1 to n (relational schemas, DBMS).

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include: CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access. SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software. Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl.
E-mail: jannekevankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (1 February 2001 to 30 July 2003), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (March 2002 to August 2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003

IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter / Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.

Towards a Semantic Web for

Heritage Resources

Thematic Issue 3 May 2003

DigiCULT Consortium

wwwdigicultinfo ISBN 3-902448-00-8

Page 33: Towards a Semantic Web for Heritage Resources€¦ · Semantic Web should be based on Well-founded Ontologies 25 Guntram Geser A Cultural Heritage Semantic Web Example & Primer 26

34 DigiCULT

drawn from illustrated medieval manuscripts wefirst need to define a class of things that are imagesIn RDF Schema a class is any resource having anrdftype property whose value is the RDFS-definedresource rdfsclass

So using the basic RDF data model we definemiImage [resource] rdftype [property] rdfsClass[value]The self-defined prefix mi (for medievalimages) stands for the URI reference of our RDFSchema namespace httpwwwm-iorgschemasimages

In our image collection we have various specialkinds of digitised images such as column miniaturesdecorated initials schematic drawings etcTo distin-guish for example the miniatures first we need todefine a general class Miniature and subclasses ofminiatures eg a subclass ColumnMiniaturemiMiniature rdftype rdfsClass miColumnMiniature rdftype rdfsClass

Secondly we need to define thatmiColumnMiniature is a subclass of miMiniatureand that miMiniature is a subclass of miImagefor which we use the predefined rdfssubClassOfpropertymiMiniatures rdfssubClassOf miImage miColumnMiniature rdfssubClassOf miMiniature

As the rdfssubClassOf property is transitive thismeans that miColumnMiniature is also implicitly asubclass of miImage

Graphic 2 on page 32 visualises this with the nodesand arcs of the basic RDF data model

Defining properties

In order to make the meaning of our metadata (ielsquotypersquo) explicit we need to be capable of declaringspecific properties that characterise the classes ofthings we hold at httpwwwm-iorg eg digitalimages of medieval column miniatures

Basically RDF schema defines properties in termsof the classes of resources to which they applyThisis the role of the rdfsdomain and rdfsrangemechanisms

rdfsrange

The range constraint defines the class or set of classeswhose instances can be values of a particular pro-perty If we want to define the property mihasTypewe must describe this resource (which we locate athttpwwwm-iorgschemasimages) with anrdftype property whose value is rdfPropertymiColumnMiniature [resource] rdftype [property]rdfProperty [value]

The following RDF statements indicate thatmiColumnMiniature is a class mihasType is a proper-

ty and RDF statements using the mihasType pro-perty have instances of miColumnMiniature as valuesmiColumnMiniature rdftype rdfsClass mihasType rdftype rdfPropertymihasType rdfsrange miColumnMiniature

rdfsdomain

The domain constraint restricts the set of classeswhose instances may have a particular propertyattached to them If we want to indicate that theproperty mihasType applies to instances of classmiColumnMiniature we would writemiColumnMiniature rdftype rdfsClassmihasType rdftype rdfPropertymihas Type rdfsdomain miColumnMiniature

Benefits of RDFIn a SearchWebServicescom definition of RDFsome benefits of RDF are mentioned| lsquoBy providing a consistent framework RDF

will encourage the providing of metadata about Internet resources

| Because RDF will include a standard syntax for describing and querying data software that exploitsmetadata will be easier and faster to produce

| The standard syntax and query capability will allowapplications to exchange information more easily

| Searchers will get more precise results from searching based on metadata rather than on indexes derived from full text gathering

| Intelligent software agents will have moreprecise data to work withrsquo13

This is a well-crafted listing of RDF benefits fromprovision and exchange of better metadata to agentsworking with them hopefully for the benefit of humans But as explicitly stated by SearchWeb-Servicescom these are only potential benefits iethey depend on the level of actual uptake of RDF

13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)

Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
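The harvesting step can be pictured as a union of triple sets. The sketch below is a simplified illustration, not the FMS implementation: it assumes each museum's published RDF has already been fetched and parsed into string tuples, and all museum names, identifiers and properties are hypothetical.

```python
# Shared ontology statements (hypothetical class names).
ontology = {
    ("fms:Chair", "rdfs:subClassOf", "fms:Furniture"),
    ("fms:Furniture", "rdf:type", "rdfs:Class"),
}

# Instance descriptions as each museum might publish them (invented data).
harvested = {
    "museum_a": {("a:item1", "rdf:type", "fms:Chair")},
    "museum_b": {
        ("b:item9", "rdf:type", "fms:Chair"),
        ("b:item9", "fms:material", "fms:Oak"),
    },
}


def build_repository(ontology, harvested):
    """Combine the shared ontology and all harvested instances into one graph."""
    repository = set(ontology)
    for museum_triples in harvested.values():
        # Set union removes exact duplicates across museums.
        repository |= set(museum_triples)
    return repository


repository = build_repository(ontology, harvested)
```

Because every museum describes its objects with the same ontology terms, the merged set can be queried as a single graph without knowing which database a statement came from.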

How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
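View-based filtering as described above can be sketched as follows. This is an assumption-laden toy model rather than Ontogator's actual logic: the class hierarchy, the view names (object_type, material) and the item records are all invented for illustration.

```python
# Hypothetical class hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "Chair": "Furniture",
    "Table": "Furniture",
    "Oak": "Wood",
    "Pine": "Wood",
}

# Hypothetical collection items, each classified in two views.
ITEMS = {
    "item1": {"object_type": "Chair", "material": "Oak"},
    "item2": {"object_type": "Table", "material": "Pine"},
    "item3": {"object_type": "Chair", "material": "Pine"},
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_by_views(items, selections):
    """Return the items that satisfy every selected class restriction."""
    return sorted(
        name
        for name, facets in items.items()
        if all(is_a(facets.get(view), cls) for view, cls in selections.items())
    )


broad = filter_by_views(ITEMS, {"object_type": "Furniture"})
narrower = filter_by_views(ITEMS, {"object_type": "Chair", "material": "Wood"})
narrowest = filter_by_views(ITEMS, {"object_type": "Chair", "material": "Pine"})
```

Each additional or more specific class selection shrinks the result set, which mirrors how constraining the views further eventually leads to the records sought.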

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
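One simple way to think about such semantic links is to connect records that share a metadata value. The sketch below makes that assumption explicit; the real system derives links from the domain ontology in ways not detailed here, and the records and property names are hypothetical.

```python
from collections import defaultdict

# Hypothetical collection records with two metadata properties each.
RECORDS = {
    "item1": {"material": "Oak", "place": "Turku"},
    "item2": {"material": "Pine", "place": "Turku"},
    "item3": {"material": "Oak", "place": "Espoo"},
}


def semantic_links(records, item):
    """Group other records by the property value they share with `item`."""
    links = defaultdict(list)
    for prop, value in records[item].items():
        for other, props in records.items():
            if other != item and props.get(prop) == value:
                links[f"same {prop}: {value}"].append(other)
    return dict(links)


links_for_item1 = semantic_links(RECORDS, "item1")
```

For item1, this yields one link group per shared topic, so a user viewing an oak object from Turku is offered both other oak objects and other objects from Turku.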

From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14

This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.

Intelligent Software Agents

The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action, we mean reactive, proactive, social.’

| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’

| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’

| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:

| Mobility: the ability to move around an electronic network.

| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).

| Learning: an agent will improve its performance over time.
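The three aspects of flexible autonomous action can be illustrated with a deliberately small toy agent. This is only a sketch under strong assumptions: it is not Wooldridge's formal model, the class and method names are invented, and the "agent communication language" is reduced to a direct method call.

```python
class CollectorAgent:
    """Toy agent that gathers record identifiers about one topic."""

    def __init__(self, name, topic, goal_size):
        self.name = name
        self.topic = topic
        self.goal_size = goal_size  # how many records the agent wants
        self.collected = set()

    def share(self, topic):
        """Social ability: answer another agent's request for records."""
        return set(self.collected) if topic == self.topic else set()

    def step(self, new_records, peers):
        # Reactivity: respond to changes observed in the environment.
        for record_id, record_topic in new_records:
            if record_topic == self.topic:
                self.collected.add(record_id)
        # Proactiveness: the goal is not yet met, so take the initiative.
        if len(self.collected) < self.goal_size:
            # Social ability: ask peer agents for what they already hold.
            for peer in peers:
                self.collected |= peer.share(self.topic)


alice = CollectorAgent("alice", "seals", goal_size=2)
bob = CollectorAgent("bob", "seals", goal_size=1)

bob.step([("r1", "seals")], peers=[])
alice.step([("r2", "seals"), ("r3", "coins")], peers=[bob])
```

After these two steps, alice holds r2 from her own observation and r1 obtained from bob, while the unrelated record r3 is ignored. Mobility, rationality and learning are left out of the sketch.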


14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge, An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Resources

In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer


K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19.

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf


The Finnish Museums on the Semantic Web: Overview of the System’s Set-up

[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Labels in the diagram: User client (WWW browser; topic-based navigation; view-based filtering). Server (Ontogator software; RDF database; semantic graph, i.e. the knowledge space of shared ontology and metadata). Web crawler. For each museum (collection database 1, 2 ... n): relational schemas (DBMS); XML repository with XML Schema (syntactic interoperability); metadata editor; RDF instances with RDF Schema (semantic interoperability).]

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995, he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998, he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual and municipal archives; from 1989 to 1991 he was at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time, he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that, he joined Databasix, producer of the ADLIB software. In 1991, Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
| SPECTRUM-XML: A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992, she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999, Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995, he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996, Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT, please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


Page 34: Towards a Semantic Web for Heritage Resources€¦ · Semantic Web should be based on Well-founded Ontologies 25 Guntram Geser A Cultural Heritage Semantic Web Example & Primer 26

Generating and Using the Knowledge SpaceThe RDF cards represent the original XMLdocuments at the semantic levelThe union of suchRDF cards constitutes a knowledge base which isa harmonised semantic representation of the under-lying heterogeneous databases

However so far the RDF instance descriptionshave not left the museumThe museum has completecontrol of the information it wants to publish and itdoes not need to allow the FMS system access to itsinternal database systemThe RDF data are placed ina public directory on the museumrsquos WWW server

The Web crawler of the FMS system harvests theinstance descriptions from the different museums andthe system combines them into an RDF repositoryThis repository is a large semantic graph that consistsof the shared ontology and metadata

How does a user now search and navigate in thisknowledge space In the FMS system this is imple-mented by a server-side software called OntogatorBased on the semantic graph this software dynami-cally generates semantic linkages for the userrsquos Webbrowser

One way of using the FMS system is view-basedfilteringThe user can select classes of resources fromthe ontology and the system finds the instances thatmatch the selected class restrictions By constrainingclasses (views) further the collection instance datasearched for are eventually found

The software also supports topic-based navigationby providing semantic links between topics of inter-est the creation of which is based on the collectiondomain ontology and the related metadata of thecollection recordsThis means that the links alsoprovide the user with an impression of the widercontext and pragmatics of the objects in themuseumsrsquo collections

From human users to software agentsAs described in the Finnish Museums on theSemantic Web example the RDF repository is alarge semantic graph that consists of the sharedontology and metadata of the participating museumsSuch a repository can be queried and the results a setof pointers to the relevant resources can be accessedusing Web browsersThe opportunities provided by asystem like the one developed by the FMS initiative(eg topic-based navigation) are at present restrictedprimarily to human users

The Semantic Web vision includes intelligent soft-ware agents which lsquounderstandrsquo semantic relationshipsbetween Web resources and seek relevant informationas well as perform transactions for humans14

This software would be capable of autonomousaction ie could run without direct human controlor constant supervision and ideally is very flexible indoing this Characterisations of this flexibility includeactions that are lsquoreactive lsquoproactiversquo and lsquosocialrsquo (seebelow)

While the basic idea of agents is very intuitive andappealing the actual theory is complex the tools areimmature the solutions small and prototype-basedIn fact as a parallel distributed systems technologyagents belong to the most complex class of softwaretechnology

However this primer will conclude with a sum-mary of what an intelligent software agent is andwhat such a software would generally be capable ofdoingThis should also serve as an indication of howgreat the challenge for research and technologicaldevelopment is to make the full Semantic Web visiona reality

Intelligent Software Agents

The following definitions are taken from MichaelWooldridgersquos introduction to multiagent systems15

Agent

lsquoAn agent is a computer system capable of autono-mous action in some environmentrsquo

Intelligent agent

lsquoAn intelligent agent is a computer system capableof flexible autonomous action in some environmentrsquo

Flexible autonomous action

lsquoBy flexible autonomous action we mean reactiveproactive socialrsquo| Reactivity lsquoA reactive system is one that maintains

an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful)rsquo

| Proactiveness lsquoAn agent serves a purpose and therefore exhibits goal-directed behaviour in-cluding the capacity to recognise opportunitiesfor useful courses of actionrsquo

| Social ability lsquoSocial ability in agents is the ability to interact with other agents (and possibly humans)via some kind of agent communication languageand perhaps cooperate with othersrsquoDesirable further properties of agents are

| Mobility the ability to move around an electronic network

| Rationality an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)

| Learning an agent will improve its performance over time

DigiCULT 35

14 T Berners-Lee J Hendler

O Lassila Scientific American

May 2001

httpwwwsciamcom2001

0501issue0501berners-leehtml15 MWooldridge

An Introduction to

Multiagent Systems

ChichesterWiley 2002 and

httpwwwcsclivacuk

~mjwpubsimas

ResourcesIn the primer no references are made to thedocuments of the World Wide Web Consortium(W3C)All relevant W3C recommendations canbe found at httpwwww3corg

Of the wealth of introductory materials on XMLand RDF available on the Web the following inparticular are useful to consult for further details

httpwwww3schoolscomxmlhttpwwww3schoolscomschemahttpwwww3orgTRrdf-primer

XML repository

OntologyRDFS

K S Candan H Liu R Suvarna ResourceDescription Framework Metadata and itsApplications In SIGKDD ExplorationsVol 31 (2001) 6-19

Pierre-Antoine Champin RDF Tutorial (2001)httpwww710univ-lyon1fr~champinrdf-tutorialrdf-tutorialhtml

S Decker M P Mitra S Melnik Framework forthe Semantic WebAn RDF Tutorialhttpwwwidaliuse~asmpacoursesswebrdfrdf_tutorialpdf

Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

[Diagram: The Finnish Museums on the Semantic Web, Overview of the System’s Set-up. Components shown: User client (WWW browser; topic-based navigation, view-based filtering); Server (Ontogator software); RDF database (semantic graph: knowledge space of shared ontology and metadata); Ontology (RDFS); Web crawler; RDF instances; Metadata editor; XML repositories; Collection databases 1, 2 ... n. Layers shown: RDF Schema (semantic interoperability); XML Schema (syntactic interoperability); Relational schemas (DBMS).]
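The layering named in Graphic 3 can be illustrated with a small sketch. It is not the Ontogator software or its data model: the records, identifiers, class names and helper functions below are all hypothetical, and plain Python tuples stand in for RDF triples so that no RDF library is assumed. The sketch only shows the general idea of turning records from separate collection databases into triples, merging them into one graph, and filtering that graph through a shared class hierarchy.

```python
# Hypothetical records, as they might be exported from two collection databases.
COLLECTION_1 = [{"id": "m1:obj42", "title": "Spinning wheel", "type": "Tool"}]
COLLECTION_2 = [{"id": "m2:obj7", "title": "Wool cardigan", "type": "Garment"}]

# Toy shared ontology (subclass links), standing in for the RDF Schema layer.
SUBCLASS_OF = {"Tool": "Artefact", "Garment": "Artefact", "Artefact": "Thing"}


def to_triples(records):
    """Turn flat records into (subject, predicate, object) triples."""
    triples = set()
    for rec in records:
        triples.add((rec["id"], "dc:title", rec["title"]))
        triples.add((rec["id"], "rdf:type", rec["type"]))
    return triples


def superclasses(cls):
    """Return cls followed by all of its ancestors in the toy ontology."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def filter_by_class(graph, wanted):
    """Return subjects whose type is `wanted` or one of its subclasses."""
    return sorted(
        s for (s, p, o) in graph
        if p == "rdf:type" and wanted in superclasses(o)
    )


# Merge the triples of all collections into one shared graph.
graph = to_triples(COLLECTION_1) | to_triples(COLLECTION_2)

print(filter_by_class(graph, "Artefact"))  # objects from both museums
print(filter_by_class(graph, "Tool"))      # only the spinning wheel
```

Because both hypothetical museums type their objects against the same small hierarchy, a single query for "Artefact" reaches both collections; this is the kind of semantic interoperability the RDF Schema layer in the graphic refers to.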

THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See http://www.asrm.archivbeniculturali.it/sid/imago/IMAGOIIen.html

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems. ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University. Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.van.kersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.

Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, sross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th 2002 in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd 2002 in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray (?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


European Commission the Research LibrariesGrouprsquos PRESERV Working Group on PreservationIssues of Metadata and InterPARES (as well as Co-Chair of its European Team)E-mail SRosshatiiartsglaacuk

Andrea Scotti Institute amp Museum for theHistory of Science ItalyAndrea Scotti graduated from the University ofBologna Department of Philosophy in 1983 From1985 to 1995 he carried out several research projectsinvolving the cataloguing of scientific manuscripts inIsrael (Hebrew University of Jerusalem) Czecho-slovakia (Karol University of Prague) Hungary(Szecheny National Library of Budapest) andGermany (Institut fuumlr Geschichte der Natur-wissenschaften Munich) His numerous researchactivities concentrate on software and programmingfor library databases

Since 1996 Scotti has been Director of theGeneral Digital Catalogue of the ScientificManuscripts located at the Central National Libraryin Florence developed in co-operation with theIstituto e Museo di Storia della Scienza the NationalCentral Library and under the auspices of the ItalianMinistry for Culture He is in charge of the workthe Institute and Museum of the History of Sciencecarries out for the MESMUSES project (010201-300703) funded by the European Commissionunder the Information Society Technologies (IST)ProgrammeThe project aims at designing andexperimenting with metaphors for organisingstructuring and presenting thescientific and technical know-ledge offered to the publicimplementing Semantic WebtechnologiesSee httpcwebinriafrProjectsMesmusesE-mailscottiangalileoimssfirenzeit

DigiCULT 41

42 DigiCULT

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch thatmonitors and analyses technological developmentsrelevant to and in the cultural and scientific heritagesector over the period of 30 months (032002-082004)

In order to encourage early take up DigiCULTproduces seven Thematic Issues three TechnologyWatch Reports along with the newsletterDigiCULTInfo

DigiCULT draws on the results of the strategicstudy lsquoTechnological Landscapes for TomorrowrsquosCultural Economy (DigiCULT)rsquo that was initiatedby the European Commission DG InformationSociety (Unit D2 Cultural Heritage Applications)in 2000 and completed in 2001

Copies of the DigiCULT Full Report andExecutive Summary can be downloaded or orderedat httpwwwdigicultinfo

For further information on DigiCULT pleasecontact the team of the project co-ordinator

Mr Guntram GeserguntramgesersalzburgresearchatPhone +43-(0)662-2288-303

Mr John Pereira johnpereirasalzburgresearchatPhone +43-(0)662-2288-247

Salzburg Research ForschungsgesellschaftJakob-Haringer-Str 5IIIA - 5020 Salzburg Austria Phone +43-(0)662-2288-200Fax +43-(0)662-2288-222httpwwwsalzburgresearchat

Project PartnerHATII - Humanities Advanced Technology andInformation InstituteUniversity of Glasgow httpwwwhatiiartsglaacukContact Mr Seamus Ross srosshatiiartsglaacuk

The members of the Steering Committeeof DigiCULT arePhilippe Avenier Ministegravere de la culture et de lacommunication FrancePaolo BuonoraArchivio di Stato di Roma ItalyCostis Dallas Critical Publics SA GreeceBert Degenhart-DrenthADLIB Information SystemsBVThe NetherlandsPaul Fiander BBC Information amp ArchivesUnited KingdomPeter Holm Lindgaard Library Manager DenmarkErich J Neuhold Fraunhofer IPSI GermanyBruce Royan Concurrent ComputingUnited Kingdom

DIGICULT PROJECT INFORMATION

DigiCULT Thematic Issue 1 - Integrity andAuthenticity of Digital Cultural Heritage Objectsbuilds on the first DigiCULT Forum held inBarcelona on May 6th 2002 in the context of theDLM-Conference 2002

DigiCULT Thematic Issue 2 ndash Digital AssetManagement Systems for the Cultural and ScientificHeritage Sector builds on the second DigiCULTForum held in Essen Germany on September 3rd2002 in the context of the AIIM Conference DMS EXPO

DigiCULT Thematic Issue 3 - Towards a SemanticWeb for Heritage Resources builds on the thirdDigiCULT Forum held on January 21st 2003 atFraunhofer IPSI Darmstadt Germany

DigiCULT Thematic Issue 4 will follow thefourth DigiCULT Forum on Learning Objectsthat will take place at the Koninklijke Bibliotheek -National Library of the NetherlandsThe Hagueon July 2nd 2003

IMPRINT

This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)

AuthorsGuntram Geser Salzburg ResearchJoost van Kasteren JournalistSeamus Ross University of Glasgow HATII Michael Steemson Caldeson Consultancy

ImagesImages for this Thematic Issue have been providedby and are reproduced with kind permission of theKoninklijke Bibliotheek ndash National Library of theNetherlandsThe Hague Netherlands

Graphics amp LayoutJan Steindl Salzburg Research

ISBN 3-902448-00-8Printed in Austriacopy 2003

DigiCULT 43

44 DigiCULT

IMAGES

Augustine La Citeacute de Dieu (Book 1-10)Paris c 1400-1410Volume IImage on p 5 from Fol 264r size 80x75illuminator Orosius Master ao

Bible Historiale Paris c 1320-1340Volume IImage on p 7 from Fol 2r size 45x50illum lsquoSub-Fauvelrsquo Master

Historic Bible Utrecht c 1430Volume IImages on pp 8 10 14 28from Fol 3r 3v 4v 7v size 55x85 to 60x85illumAlexander Master oa

Psalter Breviary of St Bridget Den BoschMonastery Marienwater Bridgettines 1468Images on pp 11 13Wednesday Matins Invitatoriumfrom Fol 219r

Jacob van Maerlant Der Naturen BloemeFlanders c 1350Images on pp 15 (Cerilius) 17 (Fastaleon)18 (Draco) 19 (Zitiron)from Fol 104rb1 106rb2 124r 111rasize 40x55 to 50x55

Jacob van Maerlant Spieghel HistoriaelWest Flanders c 1325-1335Image on p 21 from Fol 4va1size 45x55

Lambert of St Omer Liber FloridusLille and Ninove 1460Images of Signs of the Zodiacon pp 22 23 24 38 39 40 41

Psalter Normandy c 1180Image on p 26 from Fol 3vsize 160x125

Breviary Cambray() c 1275-1300Images on pp 27 29 31 33 34 35 37from Fol 211r size 205x135

copyKoninklijke BibliotheekThe HagueUsed with permission

Towards a Semantic Web for

Heritage Resources

Thematic Issue 3 May 2003

DigiCULT Consortium

wwwdigicultinfo ISBN 3-902448-00-8


THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in a doctoral research in 'Urban and rural history'; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA; co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM); Head of Documentation and Systems of the Benaki Museum; General Director of the Foundation of the Hellenic World; Special Secretary of the Greek Ministry of Education, in charge of libraries, archives and instructional technologies; and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing and, more recently, she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl.
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia, VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01.02.01-30.07.03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it


DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues, three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)' that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, sross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum held in Barcelona on May 6th 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum held in Essen, Germany, on September 3rd 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5, from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7, from Fol. 2r, size 45x50, illum. 'Sub-Fauvel' Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum. Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.


Page 37: Towards a Semantic Web for Heritage Resources€¦ · Semantic Web should be based on Well-founded Ontologies 25 Guntram Geser A Cultural Heritage Semantic Web Example & Primer 26

Museum General Director of the Foundation of theHellenic World Special Secretary of the GreekMinistry of Education in charge of libraries archivesand instructional technologies and Special Advisor tothe Greek Foreign Minister on cultural andinformation technology issuesE-mail infocriticalpublicscom

Bert Degenhart-Drenth ADLIB InformationSystems BVThe NetherlandsBert Degenhart-Drenth is the founder and generalmanager of ADLIB Information Systems BV(httpwwwnladlibsoftcom) a leading companyin the field of library museum and archive auto-mationAlthough his background is in electronicengineering he became involved in a museumautomation project as early as 1983 Degenhart-Drenth worked for the MARDOC foundation inRotterdam for three years setting up one of thefirst integrated museum automation systems in TheNetherlandsAfter that he joined Databasix producerof the ADLIB software In 1991 Degenhart-Drenthled a management buy-out to form DatabasixInformation Systems which was renamed later asADLIB Information SystemsADLIB Information Systems is now the marketleader in museum automation in the Benelux regionand has more than 1000 customers in the librariesmuseums and archives fieldADLIB is a CIMImember and has as such been involved in the CIMIZ3950 and Dublin Core test beds Degenhart-Drenth has been a core member in the developmentof the Spectrum-XML schema and collaborates inthe EMII-DCF project Relevant ADLIB projectsinclude CIMI Dublin Core test bed - ADLIB hoststhe CIMI Dublin Core test bed and has developeda database application for thisThe Open ArchivesInitiative Protocol version 10 is now available forthis database in addition to the lsquostandardrsquo HTMLXML access SPECTRUM-XML - A project of theUK mda together with a team of software vendorsto produce a schema for exchange of data whichcontains information elements from the UKSpectrum standardThis will be implemented in

the ADLIB Museum software Internet GelderseMusea (httpwwwigemnl) and Maritiem Digitaal(httpwwwmaritiemdigitaalnl)Two Web-basedprojects that make the data from multiple museums(including their library data) available on the Webbased on a three-tier implementation with XML asthe data exchange mechanism E-mailbertnladlibsoftcom

Nicola Guarino Italian National ResearchCouncil ItalyNicola Guarino is a senior researcher at the Institutefor Cognitive Sciences and Technologies of the ItalianNational Research Council where he leads theLaboratory for Applied Ontology He graduated inElectrical Engineering from the University of Padovain 1978 He has been active in the ontology fieldsince 1991 and has played a leading role in theAI community in promoting the study of theontological foundations of knowledge engineeringand conceptual modelling under an interdisciplinaryapproach His current research activities involveformal ontology ontology design knowledge sharingand integration and ontology-based metadatastandardisation He is general chairman of theInternational Conference on Formal Ontology inInformation Systems (FOIS) and associated editorof the International Journal of Human-ComputerStudies He has published more than 60 papers inscientific journals books and conference proceedingsand has been guest editor of three special issues ofscientific journals related to formal ontology andinformation systems He is involved in variousprojects related to ontologies and the SemanticWeb including WonderWeb and OntoWebE-mail NicolaGuarinoladsebpdcnrit

Janneke van Kersen Digital HeritageAssociationThe NetherlandsJanneke van Kersen graduated in Art History andtook part in a postgraduate programme in HistoricalInformation ProcessingAfter her graduation in 1992she worked in the field of Humanities andComputing She held several posts at Utrecht

DigiCULT 39

40 DigiCULT

University as a teacher in the Department ofHumanities and Computing and more recently shewas responsible for the realisation of computer-aidedapplications for the Department of Art HistoryFurthermore she worked at the Netherlands HistoricData Archive and taught in the postgraduateprogramme on Historical Information Processingat Leiden UniversitySince October 1999 Kersen has been working asa consultant with the Dutch Digital HeritageAssociationThe consulting focuses on digitisationstandardisation metadata and education amp ICTThemain goal of the organisation is to provide access todistributed databases of cultural heritage organi-sations such as museums archives archaeologicalorganisations monuments and special collectionsof libraries in a context-rich and XML-basedenvironment Interoperability from both a technicaland an organisational perspective is therefore a majorissueThe Association offers access to heritageinformation to the general public athttpwwwcultuurwijzernl and customisedaccess for the educational field athttpwwwcultuurwijsnlE-mail jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union-funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)', which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.

For further information on DigiCULT, please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50; illum.: 'Sub-Fauvel' Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v; size 55x85 to 60x85; illum.: Alexander Master a.o.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13; Wednesday Matins Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra; size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray (?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague
Used with permission






DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)' that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.

For further information on DigiCULT, please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.

IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003


IMAGES

Augustine, La Cité de Dieu (Book 1-10)
Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75
illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50
illum.: 'Sub-Fauvel' Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85
illum.: Alexander Master o.a.

Psalter Breviary of St Bridget
Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme
Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron)
from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael
West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus
Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray (?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague
Used with permission


