Linked data: The play’s the thing

Ed Jones, National University (San Diego)

ALA Annual Conference (New Orleans)

Waiting for the Semantic Web

VLADIMIR: Well, shall we move to the Semantic Web?

ESTRAGON: Yes, let’s move to the Semantic Web.

[They don’t move.]

[apologies to Samuel Beckett]

outline

1. The playground
2. Playground rules
3. The players
   1. Playing nice
   2. Playing sort of nice
   3. Playing not so nice

The playground

[Linked open data cloud diagram, by Richard Cyganiak and Anja Jentzsch, http://lod-cloud.net/]

Who we play with

Playground rules

Ranganathan’s first law of linked data:

Data is for use

[or, for the true die-hard, Data are for use]

Corollary 1 to Ranganathan’s first law

Functional granularity

BISG DP on ISTC: What gets an ISTC? Moby-Dick alternatives:

1. Every version is Moby-Dick (one ISTC)
2. All versions derive from an Ur-parent (Melville scholar) (one ISTC for Ur-parent, one ISTC for each derivative)
3. Some versions derive from different texts (librarian) (one ISTC for Ur-parent, one ISTC for each derivative text)
4. Some versions are augmented by introduction and notes that are separate works (“an even more pedantic librarian, dancing angels on the head of a pin”) (one ISTC for …, one ISTC for each component (introduction, biographical note, etc.))
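To make the granularity choice concrete, here is a minimal sketch that counts the ISTCs each of the first three alternatives would assign. The three editions and the labels "text A" and "text B" are hypothetical, and the fourth alternative is left out because the slide elides part of its rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    label: str      # hypothetical edition label
    base_text: str  # hypothetical identifier of the text the edition follows


# Hypothetical sample: three editions that follow two different texts.
VERSIONS = [
    Version("edition 1", "text A"),
    Version("edition 2", "text A"),
    Version("edition 3", "text B"),
]


def istc_count(versions, alternative: int) -> int:
    """Count ISTCs under alternatives 1-3 as read from the slide."""
    if alternative == 1:
        # Every version is the same work: a single ISTC.
        return 1
    if alternative == 2:
        # One ISTC for the Ur-parent plus one per derivative version.
        return 1 + len(versions)
    if alternative == 3:
        # One ISTC for the Ur-parent plus one per distinct derivative text.
        return 1 + len({v.base_text for v in versions})
    raise ValueError("only alternatives 1-3 are modelled in this sketch")


counts = {alt: istc_count(VERSIONS, alt) for alt in (1, 2, 3)}
print(counts)
```

The same three editions yield different identifier counts depending only on the granularity the assigner finds useful.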

Corollary 2 to Ranganathan’s first law

“If you build it, they will come”

1. There is (or will be [maybe, hopefully]) a lot of linkable data out there

2. Others will want some of our data and make links

3. We will want some of theirs and make links

The players

Martin Prince (Playing nice)
Ralph Wiggum (Playing sort of nice)
Nelson Muntz (Not playing nice)

Martin Prince: Playing nice

©2009 Twentieth Century Fox Film Corporation

Playing nice: Tim Berners-Lee’s rules for linked data

1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
4. Include links to other URIs so that they can discover more things.
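Rules 2 and 3 amount to an HTTP lookup that asks for RDF rather than HTML. A minimal sketch, assuming the server honours an Accept header for RDF/XML (the slides do not say which servers do). The URI is one of the RDA element URIs cited in this talk, and the request is only built, never sent.

```python
from urllib.request import Request

# Rules 1 and 2: an HTTP URI used as the name of a thing.
ELEMENT_URI = "http://RDVocab.info/Elements/placeOfPublicationManifestation"


def build_rdf_request(uri: str) -> Request:
    """Build (but do not send) a GET request that asks for RDF/XML."""
    # Rule 3: request a standard machine-readable format. Whether a given
    # server actually answers with RDF/XML is an assumption of this sketch.
    return Request(uri, headers={"Accept": "application/rdf+xml"})


request = build_rdf_request(ELEMENT_URI)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup. Rule 4 is then a matter of what the returned description links to.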

How we play: Group 3

How we play: Group 2

How we play: RDA element sets and value vocabularies

Wie sie spielen (How they play): GND

Summary: Martin (Sir Tim’s rules)

1. There are some nice resource files of FRBR Group 2 and 3 entities available in RDF

2. RDA-specific vocabularies are making headway

3. The Germans are eating our lunch

Ralph Wiggum: Playing sort of nice

©2009 Twentieth Century Fox Film Corporation

Functional granularity

How much granularity in RDA?

Too little?
Too much?
Just right?

Too little granularity

Sometimes it may be more useful to express an attribute at a more granular level

RDA 7.12 Language of the content (captions)
RDA 7.14 Accessibility content (closed captions)

[041] 17 $a en $j de $2 iso639-1
[546] \\ $a Closed captioning in German.

http://RDVocab.info/Elements/languageOfTheContentExpression

http://RDVocab.info/Elements/accessibilityContentExpression
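A minimal sketch of the difference, using the 041 field from the slide. MARC keeps the caption language in its own subfield ($j), while a single "language of the content" element would flatten the two values together. The helper `parse_subfields` and the dictionary keys are illustrative names, not part of MARC or RDA.

```python
import re


def parse_subfields(field_body: str) -> dict[str, list[str]]:
    """Split a MARC field body written as '$a en $j de' into subfield lists."""
    subfields: dict[str, list[str]] = {}
    # Each subfield is '$' + one code character + text up to the next '$'.
    for code, value in re.findall(r"\$(\w)\s*([^$]*)", field_body):
        subfields.setdefault(code, []).append(value.strip())
    return subfields


field_041 = parse_subfields("$a en $j de $2 iso639-1")

# Coarse: one undifferentiated list of content languages.
coarse = {"languageOfTheContent": field_041["a"] + field_041["j"]}

# Finer: the caption language stays apart from the main language.
# The keys "content" and "captions" are illustrative only.
fine = {"content": field_041["a"], "captions": field_041["j"]}

print(coarse)
print(fine)
```

In the coarse form nothing says that German applies only to the captions. The finer form keeps that distinction available for anyone who wants to use it.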

Too little granularity

RDA 2.15.1.4: Record identifiers in accordance with any prescribed display format; otherwise precede the identifier with a relevant trade name or agency

020 ISBN (scope: binding / publisher / unit / acidity / e-book format / etc.)
022 ISSN
024 Lots of others (ISMN / EAN / DOI / GTIN-14 / etc.)

http://RDVocab.info/Elements/identifierForTheManifestation
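One way to picture the missing granularity: a typed identifier keeps the scheme as data, whereas the RDA 2.15.1.4 form folds it into a display string. This is a sketch only. The `Identifier` type and both helper functions are hypothetical, and the ISBN is the one from the Russian ISBD record shown at the end of this talk.

```python
from typing import NamedTuple, Optional

# MARC tags named on the slide. Field 024 carries many schemes, so it has
# no single default label and the caller must supply one.
DEFAULT_SCHEME = {"020": "ISBN", "022": "ISSN"}


class Identifier(NamedTuple):
    scheme: str
    value: str


def typed_identifier(tag: str, value: str, scheme: Optional[str] = None) -> Identifier:
    """Keep the identifier scheme as a separate, machine-readable part."""
    resolved = scheme or DEFAULT_SCHEME.get(tag)
    if resolved is None:
        raise ValueError(f"field {tag} needs an explicit scheme")
    return Identifier(resolved, value)


def display_string(identifier: Identifier) -> str:
    """Flatten to the 'label, then identifier' string form."""
    return f"{identifier.scheme} {identifier.value}"


isbn = typed_identifier("020", "9985-827-41-4")
print(isbn)
print(display_string(isbn))
```

Going from the typed form to the string is trivial. Going back means parsing a label out of free text, which is the cost of recording everything under one identifier element.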

Too little or too much granularity?

RDA 2.8.2.3 Recording Place of Publication
RDA 2.20.7.3 Details Relating to Publication Statements

[260] $a Nizhny Novgorod : $b Izd-vo “Spasibo!”, $c 2008.
[500] $a Published in Nizhny Novgorod, South Carolina.

http://RDVocab.info/Elements/placeOfPublicationManifestation

http://RDVocab.info/Elements/noteOnPublicationStatementManifestation

Too much granularity?

RDA 2.7 Production Statement: 260 $a - c
  RDA 2.7.2 Place of Production: 260 $a
  RDA 2.7.3 Parallel Place of Production: 260 $a
  RDA 2.7.4 Producer’s Name: 260 $b
  RDA 2.7.5 Parallel Producer Name: 260 $b
  RDA 2.7.6 Date of Production: 260 $c

RDA 2.8 Publication Statement: 260 $a - c
  Ditto

RDA 2.9 Distribution Statement: 260 $a - c
  Ditto
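The mapping above is many-to-one, which a short sketch makes visible. The two tables below restate only what the slide lists (the 2.8 and 2.9 sub-elements are "ditto" on the slide, so they are not spelled out here), and `invert` is a hypothetical helper.

```python
from collections import defaultdict

# Statement-level mapping from the slide: three RDA statements, one MARC field.
STATEMENT_TO_FIELD = {
    "RDA 2.7 Production Statement": "260",
    "RDA 2.8 Publication Statement": "260",
    "RDA 2.9 Distribution Statement": "260",
}

# Element-level mapping for the one statement the slide spells out.
ELEMENT_TO_SUBFIELD = {
    "RDA 2.7.2 Place of Production": "260 $a",
    "RDA 2.7.3 Parallel Place of Production": "260 $a",
    "RDA 2.7.4 Producer's Name": "260 $b",
    "RDA 2.7.5 Parallel Producer Name": "260 $b",
    "RDA 2.7.6 Date of Production": "260 $c",
}


def invert(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Group RDA elements by the MARC target they collapse into."""
    inverse = defaultdict(list)
    for rda_element, marc_target in mapping.items():
        inverse[marc_target].append(rda_element)
    return dict(inverse)


candidates_for_260 = invert(STATEMENT_TO_FIELD)["260"]
candidates_for_260a = invert(ELEMENT_TO_SUBFIELD)["260 $a"]
print(candidates_for_260)
print(candidates_for_260a)
```

Going from RDA to MARC loses distinctions. Going from a legacy 260 back to RDA leaves several candidate elements for every subfield.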

ISBD RDF

http://iflastandards.info/ns/isbd/elements/[property]

hasPlaceOfPublicationProductionDistribution
<info:lccn/ca35000361> <isbd:P1016> “Paris”

hasNameOfPublisherProducerDistributor
<info:lccn/ca35000361> <isbd:P1017> “Pagnerre”

hasDateOfPublicationProductionDistribution
<info:lccn/ca35000361> <isbd:P1018> “1862”

hasPublicationProductionDistributionEtcArea
<info:lccn/ca35000361> <isbd:P1162> “Paris : Pagnerre, 1862.”
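The four statements above can be written out as N-Triples by expanding the isbd: prefix to the namespace given on the slide. A minimal sketch in plain Python; `to_ntriple` is a hypothetical helper, and the escaping covers only backslashes and double quotes, not the full N-Triples grammar.

```python
ISBD_NS = "http://iflastandards.info/ns/isbd/elements/"
SUBJECT = "info:lccn/ca35000361"

# (property, literal) pairs exactly as they appear on the slide.
STATEMENTS = [
    ("P1016", "Paris"),
    ("P1017", "Pagnerre"),
    ("P1018", "1862"),
    ("P1162", "Paris : Pagnerre, 1862."),
]


def to_ntriple(subject: str, prop: str, literal: str) -> str:
    """Serialise one statement as an N-Triples line with a plain literal."""
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{ISBD_NS}{prop}> "{escaped}" .'


triples = [to_ntriple(SUBJECT, prop, literal) for prop, literal in STATEMENTS]
print("\n".join(triples))
```

The first three triples carry the parts and the fourth carries the whole area as one string, so a consumer can pick whichever granularity is useful.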

How do we use the Publication, etc., area?

Sometimes others have more granular metadata

RDA: Publisher’s name (transcribed string from preferred source, may be publisher name or publisher imprint)

ONIX: Publisher (controlled name)
ONIX: Imprint (controlled name)

Example: “China's Multilateral Co-operation in Asia and the Pacific”

Taylor & Francis Group publishes it under its Routledge imprint

[260] $a Milton Park, Abingdon, Oxon ; $a New York : $b Routledge, $c c2010.

And even more granular…

<Publisher>
  <PublishingRole> <b291> [List 45]
    01=Publisher
    02=Co-publisher
    03=Sponsor
    05=Host/distributor of electronic content
    06=Published for/on behalf of
    07=Published in association with
    etc.
  <NameCodeType> <b241> [List 44]
  <NameCodeTypeName> <b242>
  <NameCodeValue> <b243>
  <PublisherName> <b081>
</Publisher>
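As an illustration, here is a sketch that builds a Publisher composite for the Routledge example using the reference tag names listed above. It is not a complete or validated ONIX record. Assigning role code 01 to Taylor & Francis Group is an assumption based on the slide's wording, the name-code elements are left out because the slides give no values for them, and `publisher_composite` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def publisher_composite(role_code: str, name: str) -> ET.Element:
    """Build a Publisher composite with a role code and a publisher name."""
    publisher = ET.Element("Publisher")
    # Role code from List 45, for example "01" = Publisher.
    ET.SubElement(publisher, "PublishingRole").text = role_code
    ET.SubElement(publisher, "PublisherName").text = name
    return publisher


# Assumption: the publishing group takes role 01 (Publisher).
composite = publisher_composite("01", "Taylor & Francis Group")
xml_text = ET.tostring(composite, encoding="unicode")
print(xml_text)
```

The MARC 260 for the same book records only the string "Routledge", with no role code and no separate statement of who the publisher is.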

Summary: Ralph

RDA granularity may be like Goldilocks and the Three Bears:

1. Some elements may be too granular
2. Some elements may not be granular enough
3. Some (probably most) elements are just right

Nelson Muntz: Not playing nice

©2009 Twentieth Century Fox Film Corporation

Legacy format

An analogy

RDA = Acela Express

MARC 21 = Too-close tracks and outdated catenary support

Legacy data

Links to resources (FRBR Group 1):
  Work/expression possibly via machine population of subfield $0
  Otherwise manifestation level only, mainly serials/IR only (fields 760-787)

Links to agents (FRBR Group 2):
  Probably via machine population of subfield $0

Links to subjects (FRBR Group 3):
  Probably via machine population of subfield $0

Links to other vocabularies:
  Nope
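Machine population of subfield $0 could look roughly like the sketch below: match the heading string against an authority index and append the identifier when there is an exact hit. The index, the example.org URI, and both helper functions are hypothetical. A real process would need normalisation rules and a policy for near matches.

```python
# Hypothetical authority index: authorized heading string -> identifier.
AUTHORITY_INDEX = {
    "Melville, Herman, 1819-1891": "http://example.org/authorities/melville",
}


def heading_key(subfields: list[tuple[str, str]]) -> str:
    """Join the alphabetic subfields and drop trailing punctuation."""
    text = " ".join(value for code, value in subfields if code.isalpha())
    return text.rstrip(" .")


def populate_subfield_0(subfields, index):
    """Return the field with a $0 appended when the heading matches exactly."""
    if any(code == "0" for code, _ in subfields):
        return list(subfields)  # already linked, leave untouched
    identifier = index.get(heading_key(subfields))
    if identifier is None:
        return list(subfields)  # no match, no link
    return list(subfields) + [("0", identifier)]


field_100 = [("a", "Melville, Herman,"), ("d", "1819-1891.")]
linked = populate_subfield_0(field_100, AUTHORITY_INDEX)
print(linked)
```

Headings that do not match the index exactly stay unlinked, which is why the slide hedges with "possibly" and "probably".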

Where will Area 4 go?

AACR2 X.4 (MARC 260) Publication, distribution, etc., area
  Place of publication, distribution, etc.
  Publisher, distributor, etc.
  Etc.

Maps to … ???

RDA 2.7 Production Statement
  Place of Production
  Producer’s Name
  Etc.

RDA 2.8 Publication Statement
  Ditto

RDA 2.9 Distribution Statement
  Ditto

Summary

What to link?
  Whatever we find useful (functional granularity)

How does RDA play with the SW?
  Potentially well. However…
  We have to get on the SW to play there
  We need appropriate levels of granularity to accommodate / exploit existing data
  We need to link more outside our little corner of the linked data cloud

I ♥ ISBD

Новая эстонская новелла : 1990-е годы : Пер. с эстон. / Сост.: Пирет Вийрес; Послесл.: Каяр Прууль. - Таллинн : Aleksandra, Cop. 1999. - 302, [1] с. : портр.; 21 см. - (Библиотека журнала Таллинн; 7).

ISBN 9985-827-41-4

Художественная литература -- Эстония -- Эстонская литература -- 2-ая пол. 20 в. -- Рассказы -- Сборник разных авторов

Хранение: 2P 8/44-2;

Not sure what this means
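ISBD punctuation is what makes the record above legible without knowing Russian. A minimal sketch that splits the description into areas on ". - " and then splits the publication area on " : " and ", ". It handles this one record only. The helper names are hypothetical, and real ISBD parsing has many more cases (multiple places, missing elements, punctuation inside transcribed text).

```python
# The description from the slide, up to the series area.
RECORD = (
    "Новая эстонская новелла : 1990-е годы : Пер. с эстон. / "
    "Сост.: Пирет Вийрес; Послесл.: Каяр Прууль. - "
    "Таллинн : Aleksandra, Cop. 1999. - "
    "302, [1] с. : портр.; 21 см. - "
    "(Библиотека журнала Таллинн; 7)."
)


def split_areas(record: str) -> list[str]:
    """Split an ISBD description into areas on the '. - ' separator."""
    return [area.strip() for area in record.split(". - ")]


def parse_publication_area(area: str) -> dict[str, str]:
    """Split 'Place : Publisher, Date' using the prescribed punctuation."""
    place, rest = area.split(" : ", 1)
    publisher, date = rest.split(", ", 1)
    return {"place": place, "publisher": publisher, "date": date}


areas = split_areas(RECORD)
title_area, publication_area, physical_area, series_area = areas
title, responsibility = title_area.split(" / ", 1)
publication = parse_publication_area(publication_area)
print(publication)
```

The punctuation alone identifies the place, publisher, and date. The final line of the record has no such prescribed punctuation, which is presumably why it stays opaque.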