Metadata matters Graham Bell EDItEUR Galley Club 2nd April 2014

Metadata matters

Graham Bell, EDItEUR

Galley Club

2nd April 2014

About me

• 20 years' experience at the point where book publishing and technology meet

• formerly senior manager in IT department for HarperCollins UK

• led development of bibliographic, editorial and digital asset management systems

• involved in e-book, e-audio, print-on-demand and online projects

• joined EDItEUR in mid-2010, primarily responsible for ONIX development

About EDItEUR

• not-for-profit membership organisation

• develops, supports and promotes metadata and identification standards for the book, e-book and serials supply chains

• acknowledged centre of expertise on standards and metadata for the industry

• based in London, but a global membership of publishers, distributors, wholesalers, subscription agents, retailers, libraries, system vendors, rights organizations and trade associations

About EDItEUR

• also provides management services to International ISBN, ISTC, ISNI Agencies

• EDItEUR has three full-time staff, one FTE part-time staff, plus access to consultants from both the book and serials sectors

• we also work closely with other standards organisations, to ensure our standards meet the needs of their stakeholders too

• member participation is vital to ensure that standards keep pace with evolving business requirements

The history of book metadata, in three slides

“I read a book once. Green, it was.”

<ProductFormFeature>
  <ProductFormFeatureType>01</ProductFormFeatureType>
  <ProductFormFeatureValue>GRN</ProductFormFeatureValue>
</ProductFormFeature>

01 = colour of cover

GRN = green

Title: The fire engine that disappeared
Author: Maj Sjöwall
ISBN: 978-0-00-783533-1
Pub date: 07-08-2007

Title: The man on the balcony
Author: Maj Sjöwall
ISBN: 978-0-00-724293-1
Pub date: 15-01-2007

Title: Roseanna
Author: Maj Sjöwall
ISBN: 978-0-00-723283-3
Pub date: 07-08-2006

<ProductIdentifier>
  <ProductIDType>15</ProductIDType>
  <IDValue>9780007232833</IDValue>
</ProductIdentifier>

<Contributor>
  <SequenceNumber>1</SequenceNumber>
  <ContributorRole>A01</ContributorRole>
  <NameIdentifier>
    <NameIDType>16</NameIDType>
    <IDValue>0000000121479135</IDValue>
  </NameIdentifier>
  <PersonNameInverted>Sjöwall, Maj</PersonNameInverted>
  <BiographicalNote textformat="05"><p>Maj Sjöwall is a poet. She lives in Sweden.</p></BiographicalNote>
</Contributor>

<TitleDetail>
  <TitleType>01</TitleType>
  <TitleElement>
    <TitleElementLevel>01</TitleElementLevel>
    <NoPrefix/>
    <TitleWithoutPrefix>Roseanna</TitleWithoutPrefix>
  </TitleElement>
</TitleDetail>
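One way to see how a record like this breaks down into separate statements is to walk the XML and emit one (subject, predicate, object) tuple per fact. The following is only a minimal sketch using Python's standard library: the `<Product>` wrapper, the function name `onix_to_triples` and the local labels `book` and `contributor1` are assumptions made for illustration, and the fixed "name type 01" label simply follows the slides rather than being read from the XML.

```python
import xml.etree.ElementTree as ET

# Cut-down ONIX fragment based on the slides. The <Product> wrapper is
# added here only so the fragment parses as a single XML document.
ONIX = """
<Product>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>
    <IDValue>9780007232833</IDValue>
  </ProductIdentifier>
  <Contributor>
    <ContributorRole>A01</ContributorRole>
    <NameIdentifier>
      <NameIDType>16</NameIDType>
      <IDValue>0000000121479135</IDValue>
    </NameIdentifier>
    <PersonNameInverted>Sjöwall, Maj</PersonNameInverted>
  </Contributor>
  <TitleDetail>
    <TitleType>01</TitleType>
    <TitleElement>
      <TitleWithoutPrefix>Roseanna</TitleWithoutPrefix>
    </TitleElement>
  </TitleDetail>
</Product>
"""


def onix_to_triples(xml_text):
    """Return (subject, predicate, object) tuples for one product."""
    product = ET.fromstring(xml_text.strip())
    triples = []

    # book -> title
    title_type = product.findtext("TitleDetail/TitleType")
    title = product.findtext("TitleDetail/TitleElement/TitleWithoutPrefix")
    triples.append(("book", f"title type {title_type}", title))

    # book -> contributor -> name and identifier
    for n, contrib in enumerate(product.findall("Contributor"), start=1):
        node = f"contributor{n}"  # hypothetical local label for the contributor
        role = contrib.findtext("ContributorRole")
        triples.append(("book", f"role {role}", node))
        # "name type 01" is the label used on the slides (assumption: fixed here)
        triples.append((node, "name type 01", contrib.findtext("PersonNameInverted")))
        id_type = contrib.findtext("NameIdentifier/NameIDType")
        id_value = contrib.findtext("NameIdentifier/IDValue")
        triples.append((node, f"ID type {id_type}", id_value))
    return triples


TRIPLES = onix_to_triples(ONIX)
for triple in TRIPLES:
    print(triple)
```

Each tuple carries one fact on its own, which is the shift from "a record" to "a collection of statements".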

[Diagram: the record as triples, subject → predicate → object]

• book → title type 01 (from List 15) → "Roseanna"

• book → role A01 (from List 17) → contributor

• contributor → name type 01 (from List 18) → "Sjöwall, Maj"

• contributor → ID type 16 (from List 44) → "0000000121479135"

Linked data

• expresses metadata as a collection of triples

• uses URIs to represent relations and entities

• prefers persistent HTTP URIs so they can be ‘looked up’ to get further details

• the data can be ‘self-describing’

• is intended to be flexible and extensible, because it’s ‘schemaless’

• isn’t new

• is not intended for human consumption

• Uniform Resource Identifier – two types…

• Uniform Resource Name

• urn:isbn:9780001234567

• Uniform Resource Locator

• http://dx.doi.org/10.978.000/1234567
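The difference between the two kinds of URI can be checked mechanically from the scheme. This is a small sketch using Python's standard `urllib.parse`; `uri_kind` is a hypothetical helper name, and treating every `http` or `https` URI as a URL follows the simplification used on the slide.

```python
from urllib.parse import urlparse


def uri_kind(uri):
    """Classify a URI as 'URN', 'URL' or 'other' by its scheme."""
    scheme = urlparse(uri).scheme.lower()
    if scheme == "urn":
        return "URN"   # a name: identifies, but says nothing about where to look
    if scheme in ("http", "https"):
        return "URL"   # a locator: can be 'looked up' over the web
    return "other"


for example in ("urn:isbn:9780001234567",
                "http://dx.doi.org/10.978.000/1234567"):
    print(example, "is a", uri_kind(example))
```

Only the second form can be dereferenced directly, which is why linked data prefers persistent HTTP URIs.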

[Diagram: the same triples written with URIs, as subject, predicate, object]

• http://harpercollins.co.uk/360366, http://ns.editeur.org/onix/codelist/15#01, "Roseanna"

• http://harpercollins.co.uk/360366, http://ns.editeur.org/onix/codelist/17#A01, genid:A96

• genid:A96, http://ns.editeur.org/onix/codelist/44#16, "0000000121479135"

• genid:A96, http://ns.editeur.org/onix/codelist/18#01, "Sjöwall, Maj"

• http://harpercollins.co.uk/360366, http://ns.editeur.org/onix/codelist/5#15, urn:isbn:9780007232833
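As an illustration of how such triples look once written down for a machine, the sketch below serialises them in the N-Triples style: URIs in angle brackets, literal values in quotes, one statement per line. The URIs are the ones shown on the slides and are not checked for whether they resolve. Writing the slide's `genid:A96` as the blank node `_:A96` is an assumption, as are the helper names `Literal`, `term` and `to_ntriples`.

```python
BOOK = "http://harpercollins.co.uk/360366"
LIST = "http://ns.editeur.org/onix/codelist/"


class Literal(str):
    """Marks a plain string value, as opposed to a URI or blank node."""


# The five statements from the slides. "_:A96" stands in for genid:A96
# (assumption: the contributor is an unnamed, or blank, node).
TRIPLES = [
    (BOOK, LIST + "15#01", Literal("Roseanna")),
    (BOOK, LIST + "17#A01", "_:A96"),
    ("_:A96", LIST + "44#16", Literal("0000000121479135")),
    ("_:A96", LIST + "18#01", Literal("Sjöwall, Maj")),
    (BOOK, LIST + "5#15", "urn:isbn:9780007232833"),
]


def term(value):
    """Format one term: quoted literal, blank node as-is, URI in <>."""
    if isinstance(value, Literal):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    if value.startswith("_:"):
        return value
    return f"<{value}>"


def to_ntriples(triples):
    """Serialise triples one per line, each terminated by ' .'."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


print(to_ntriples(TRIPLES))
```

Nothing here depends on a record structure: statements can be added, removed or merged with statements from another source one line at a time.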

An aside

• linked data is often confused with the concept of ‘Linked Open Data’

• linked data, but with an ideological view that data should be open and freely available for anyone to reuse without restrictions

• much of the data that’s claimed to be LOD is in fact data where the owner has omitted to post a licence governing its reuse

• most of the truly open data is often data produced by the public sector

http://lod-cloud.net/

Linking data

• linking depends on shared entities and concepts

• we need to be sure we are really talking about the same thing – not just two people with the same name, but actually the same contributor

• we need controlled vocabularies, taxonomies and clear, shared semantics based on shared data models, and good public identifiers

• linked data needs careful data modelling, and a concern for semantics and identity – and it’s technically challenging. There’s a steep learning curve
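The point about identity can be made concrete with a small sketch: two records are treated as the same contributor only when they share a public identifier, never on the strength of a matching name. The record layout and the function name `same_contributor` are assumptions for illustration; the identifier is the one from the ONIX example, and the third record is a hypothetical entry that has a name but no identifier.

```python
def same_contributor(a, b):
    """True only when both records carry the same non-empty identifier.

    Names are deliberately ignored: a matching name is not evidence of a
    matching person, and differing name forms do not rule one out.
    """
    return bool(a.get("isni")) and a.get("isni") == b.get("isni")


publisher_record = {"name": "Sjöwall, Maj", "isni": "0000000121479135"}
library_record = {"name": "Maj Sjöwall", "isni": "0000000121479135"}
# Hypothetical record: same name string, but no identifier supplied.
unidentified_record = {"name": "Sjöwall, Maj", "isni": None}

print(same_contributor(publisher_record, library_record))
print(same_contributor(publisher_record, unidentified_record))
```

The first pair links despite the different name forms. The second does not, even though the names are identical, because there is nothing to confirm that they refer to the same person.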

Linked data promise

• just another way of expressing the same data

• book metadata is already ‘data with links’

• linked data removes the limits of ‘the record’

• but optimised for machines, not people

• Tim Berners-Lee’s ‘semantic web’

• allows machines to ‘browse’ for other data – so if your book is set in Sweden, then it can automatically be linked to other sources of data about Sweden or to other Swedish things

• could change the game of ‘discoverability’
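The ‘set in Sweden’ idea can be sketched with two tiny sets of triples from different sources that happen to use the same URI for Sweden. Everything under `example.org` is hypothetical, including the `setIn`, `capitalOf` and `label` predicates. Only the book URI comes from the slides, and `linked_to` is an illustrative helper, not part of any library.

```python
BOOK = "http://harpercollins.co.uk/360366"
SWEDEN = "http://example.org/place/Sweden"   # hypothetical shared URI
VOCAB = "http://example.org/vocab/"          # hypothetical vocabulary

# Statements from the publisher's own metadata.
publisher_data = [
    (BOOK, VOCAB + "setIn", SWEDEN),
]

# Statements from an unrelated source that uses the same URI for Sweden.
other_data = [
    ("http://example.org/place/Stockholm", VOCAB + "capitalOf", SWEDEN),
    (SWEDEN, VOCAB + "label", "Sweden"),
]


def linked_to(node, *datasets):
    """All triples, from any dataset, that mention node as subject or object."""
    return [t for ds in datasets for t in ds if node in (t[0], t[2])]


for triple in linked_to(SWEDEN, publisher_data, other_data):
    print(triple)
```

Because both sources agree on the identifier, a machine can move from the book to everything else said about Sweden without either source knowing about the other.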

http://www.editeur.org