emtacl12 - NTNU

Post on 17-May-2020


emtacl12

[Diagram: entities and relations across three domains]

domains: Broadcasting domain, Text domain, Music domain
entities: Concert, Book, Newspaper, Radio show, TV show, Person, Song, Band
relations: is interviewed in, has written, has participated in, has played, has created, is played in, is mentioned in, is reviewed in, is member of

black metal repository

Last FM

DBpedia

MusicBrainz

BBC Music

Deichman

record level: registrator, registration standard (AACR2 etc.)

schema level: structure, semantics, mapping

repository level (linked data level): cross-collection retrieval

Chan, L. M., & Zeng, M. L. (2006). Metadata interoperability and standardization: A study of methodology part I. D-Lib Magazine, 12(6).

how to make an infrastructure for searching and browsing Norwegian black metal, based on existing library data?

problem I: incomplete metadata collections

problem II: heterogeneous metadata and metadata schemas, locally and externally

problem III: insufficient and inconsistent use of identifiers

solution: linked data (problem I) + mapping to RDF (problem II) + graph matching (problem II/III)

seed collection: the national discography (Nordisko)

retrieve MARC(XML) records (Z39.50/OAI/SRU) matching a list of preselected black metal bands
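A retrieval request of this kind could be built as follows for SRU. This is only a sketch: the endpoint URL and the CQL index name `dc.creator` are placeholders and assumptions, not the actual Nordisko configuration. The parameter names `operation`, `version`, `query`, `recordSchema` and `maximumRecords` are the standard SRU ones.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace with the real SRU base URL of the catalogue.
SRU_BASE = "https://sru.example.org/nordisko"

# Preselected bands as listed on the slides.
BANDS = ["Burzum", "Darkthrone", "Emperor", "Gorgoroth",
         "Immortal", "Satyricon", "Thorn"]

def sru_url(band, base=SRU_BASE, max_records=50):
    """Build an SRU searchRetrieve URL asking for MARCXML records for one band."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        # CQL query; the index name dc.creator is an assumption about the server.
        "query": f'dc.creator = "{band}"',
        "recordSchema": "marcxml",
        "maximumRecords": str(max_records),
    }
    return base + "?" + urlencode(params)

urls = [sru_url(band) for band in BANDS]
```

Each URL can then be fetched with any HTTP client; the response is an XML envelope containing the MARCXML records.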

RDF

make a simple ”black metal” ontology based on existing vocabularies and convert the MARC records into RDF triples using XSLT; upload the triples to a Virtuoso triple store
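The conversion step can be illustrated in a few lines. The project used XSLT; the sketch below does the same kind of mapping in Python with `xml.etree.ElementTree` instead. The field-to-property mapping (710 $a to `foaf:name`, 740 $a to `dc:title` and `dc:creator`), the `work/` path and the `slug` helper are assumptions made for illustration, not the project's actual ontology.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}
BASE = "http://blackmetal.no/"            # base URI as it appears on the slides
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"

SAMPLE = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="710"><marc:subfield code="a">Mayhem</marc:subfield></marc:datafield>
  <marc:datafield tag="740"><marc:subfield code="a">Deathcrush</marc:subfield></marc:datafield>
  <marc:datafield tag="740"><marc:subfield code="a">Freezing moon</marc:subfield></marc:datafield>
</marc:record>"""

def slug(text):
    """Hypothetical helper: turn a name into a URI-friendly token."""
    return "-".join(text.lower().split())

def record_to_triples(xml_text):
    """Map 710 $a (band) and 740 $a (title) of one record to simple triples."""
    root = ET.fromstring(xml_text)

    def values(tag):
        path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='a']"
        return [el.text.strip() for el in root.findall(path, MARC_NS) if el.text]

    triples = []
    bands = values("710")
    for band in bands:
        triples.append((BASE + slug(band), FOAF + "name", band))
    for title in values("740"):
        work = BASE + "work/" + slug(title)
        triples.append((work, DC + "title", title))
        for band in bands:
            triples.append((work, DC + "creator", BASE + slug(band)))
    return triples

triples = record_to_triples(SAMPLE)
```

The resulting tuples would still need to be serialised (for example as N-Triples) before being loaded into a triple store.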

graph matching

use SPARQL (and PHP) to match clusters of nodes and edges in our seed data against similar clusters in rich target collections providing SPARQL endpoints

use matching data in target collection for cleaning up and enriching metadata in seed collection
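A cluster lookup against a target endpoint might be phrased like this. The sketch only builds the query text (Python stands in for the PHP mentioned above); the function name `cluster_query` is hypothetical, and the use of `foaf:name` and `foaf:made` assumes the target collection models artists and albums the way the example graphs in this talk do. It matches literals exactly, so fuzzy comparison would have to happen on the client side.

```python
def cluster_query(artist, album_titles):
    """Return a SPARQL 1.1 query for an artist node plus matching album nodes."""
    def lit(text):
        # Escape backslashes and double quotes for a plain SPARQL string literal.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    titles = " ".join(lit(title) for title in album_titles)
    return f"""PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?artist ?album ?title WHERE {{
  ?artist foaf:name {lit(artist)} ;
          foaf:made ?album .
  ?album foaf:name ?title .
  VALUES ?title {{ {titles} }}
}}"""

query = cluster_query("Mayhem", ["Deathcrush", "De Mysteriis Dom Sathanas"])
```

The query string can be sent to any SPARQL endpoint over HTTP; every result row is a candidate cluster in the target collection.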

Burzum

Darkthrone

Emperor

Gorgoroth

Immortal

Satyricon

Thorn

99 MARCXML records were retrieved from Nordisko in response to queries based on preselected black metal bands
- why black metal?
- complex interlinking!

<marc:datafield tag="700">
  <marc:subfield code="a">Maniac</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
  <marc:subfield code="a">Blasphemer</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
  <marc:subfield code="a">Hellhammer</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
  <marc:subfield code="a">Necrobutcher</marc:subfield>
</marc:datafield>
<marc:datafield tag="710">
  <marc:subfield code="a">Mayhem</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
  <marc:subfield code="a">Carnage</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
  <marc:subfield code="a">Necrolust</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
  <marc:subfield code="a">Deathcrush</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
  <marc:subfield code="a">Ancient skin</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
  <marc:subfield code="a">Freezing moon</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
  <marc:subfield code="a">Fall of seraphs</marc:subfield>
</marc:datafield>
<marc:datafield tag="740">
  <marc:subfield code="a">Chainsaw gutsfuck</marc:subfield>
</marc:datafield>

<marc:datafield tag="900">
  <marc:subfield code="a">Eriksen, Rune</marc:subfield>
  <marc:subfield code="z">Blasphemer</marc:subfield>
</marc:datafield>
<marc:datafield tag="900">
  <marc:subfield code="a">Stubberud, Jørn</marc:subfield>
  <marc:subfield code="z">Necrobutcher</marc:subfield>
</marc:datafield>
<marc:datafield tag="900">
  <marc:subfield code="a">Kristiansen, Sven-Erik</marc:subfield>
  <marc:subfield code="z">Maniac</marc:subfield>
</marc:datafield>
<marc:datafield tag="900">
  <marc:subfield code="a">Blomberg, Jan Axel</marc:subfield>
  <marc:subfield code="z">Hellhammer</marc:subfield>
</marc:datafield>
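The 900 fields above pair a personal name ($a) with a stage name ($z), which is what makes it possible to connect the pseudonyms in the 700 fields to real persons. A small sketch of how such pairs could be collected, assuming the fragments are wrapped in a `marc:record` element with the standard MARCXML namespace (the function name `alias_map` is hypothetical):

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

SAMPLE = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="900">
    <marc:subfield code="a">Eriksen, Rune</marc:subfield>
    <marc:subfield code="z">Blasphemer</marc:subfield>
  </marc:datafield>
  <marc:datafield tag="900">
    <marc:subfield code="a">Blomberg, Jan Axel</marc:subfield>
    <marc:subfield code="z">Hellhammer</marc:subfield>
  </marc:datafield>
</marc:record>"""

def alias_map(xml_text):
    """Collect {stage name: personal name} from the 900 fields of one record."""
    root = ET.fromstring(xml_text)
    aliases = {}
    for field in root.findall("marc:datafield[@tag='900']", NS):
        name = field.findtext("marc:subfield[@code='a']", namespaces=NS)
        alias = field.findtext("marc:subfield[@code='z']", namespaces=NS)
        if name and alias:
            aliases[alias.strip()] = name.strip()
    return aliases

aliases = alias_map(SAMPLE)
```

Looking up a 700 $a value such as "Hellhammer" in the resulting dictionary yields the corresponding personal name, when the record provides one.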

<marc:datafield tag="110">
  <marc:subfield code="a">Kvikksølvguttene</marc:subfield>
</marc:datafield>
<marc:datafield tag="245">
  <marc:subfield code="a">Krieg</marc:subfield>
  <marc:subfield code="h">lydopptak</marc:subfield>
</marc:datafield>
<marc:datafield tag="505">
  <marc:subfield code="a">Innhold: In den Arsch gefickt / Kvikksølvguttene. Torture/ Kvikksølvguttene, Vomit. Krieg / Kvikksølvguttene, Vomit (Ztalin, elgitar). More murder / Kvikksølvguttene (Ztalin, elgitar). Anger / Kvikksølvguttene. Ghoul / Kvikksølvguttene, Mayhem. Sluts / Kvikksølvguttene (Ztalin, elgitar). Violent death / Kvikksølvguttene. Fisted sisters / Kvikksølvguttene. Naglekamp / Kvikksølvguttene (Ztalin, elgitar)</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
  <marc:subfield code="a">Necro</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
  <marc:subfield code="a">Zathan</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
  <marc:subfield code="a">Ztalin</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
  <marc:subfield code="a">H.M.P.D.K.</marc:subfield>
</marc:datafield>
<marc:datafield tag="700">
  <marc:subfield code="a">Andreassen, Ole Petter</marc:subfield>
</marc:datafield>
<marc:datafield tag="710">
  <marc:subfield code="a">Vomit</marc:subfield>
  <marc:subfield code="t">Krieg</marc:subfield>
</marc:datafield>
<marc:datafield tag="710">
  <marc:subfield code="a">Mayhem</marc:subfield>
  <marc:subfield code="t">Ghoul</marc:subfield>
</marc:datafield>
<marc:datafield tag="710">
  <marc:subfield code="a">Vomit</marc:subfield>
  <marc:subfield code="t">Torture</marc:subfield>
</marc:datafield>
<marc:datafield tag="710">
  <marc:subfield code="a">Kvikksølvguttene</marc:subfield>
  <marc:subfield code="t">Krieg</marc:subfield>
</marc:datafield>
<marc:datafield tag="710">
  <marc:subfield code="a">Kvikksølvguttene</marc:subfield>
  <marc:subfield code="t">Ghoul</marc:subfield>
</marc:datafield>

challenges

«Dauði Baldrs» «Hermoðr á Helferð» «Bálferð Baldrs» «Í Heimr Heljar» «Illa Tiðandi» «Móti Ragnarokum»

Erickson, Rune; Eriksen, Rune; Espedal, Kristian Eivind; Euronymous; Fachtal, Arataus; Faust; Fenris; Fenriz; Finstad, Børge; Frost; Gaahl; Garbarek, Anja; Goat; Goatpervertor; Greifi Grishnack; Greishnackh, Greifi; Grim; Grishnackh, Greifi; Grutle, Kjetil; H.M. Daiomonion; H.M.P.D.K.; Haraldstad, Kjell Vidar; Haraldstad, Kjetil Vidar

ambiguity from resource, registrator, registration standard, metadata structure, ontology or transformation?

browsable data: bibin.hioa.no/blackmetal

comparing graph structures/ontologies

pattern recognition

semantic correspondences

Raimond, Y., Sutton, C., & Sandler, M. (2008). Automatic interlinking of music datasets on the semantic web. Linked Data on the Web (LDOW2008).

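[Figure: collection A with resources r and collection B with resources s; is resource rx in A the same as resource sy in B?]

black metal repository: rx = http://blackmetal.no/mayhem

MusicBrainz: sy = http://musicbrainz.org/artist/c5f9e699-7b0d-4030-86dd-7acc8250d147

rx owl:sameAs sy ?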
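If a match between rx and sy is accepted, the link itself is a single triple. A minimal sketch of writing it out in N-Triples syntax (the helper name `same_as` is hypothetical; the two URIs are the ones shown above):

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def same_as(local_uri, external_uri):
    """Serialise one owl:sameAs link as a line of N-Triples."""
    return f"<{local_uri}> <{OWL_SAME_AS}> <{external_uri}> ."

link = same_as(
    "http://blackmetal.no/mayhem",
    "http://musicbrainz.org/artist/c5f9e699-7b0d-4030-86dd-7acc8250d147",
)
```

The open question is how to decide, automatically, when such a link is justified.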

Problem

G1: node 1, Artist, foaf:name ”Mayhem”
G2: node 4, Artist, foaf:name ”Mayhem”
G3: node 7, Artist, foaf:name ”Mayhem”

compared on foaf:name alone: G1 = G2 and G1 = G3

Graph matching

matching literals: comparing literals directly

n   n   literal1                          literal2                         similarity
1   4   G1: ”Mayhem”                      G2: ”Mayhem”                     1
2   5   G1: ”Deathcrush”                  G2: ”Deathcrush”                 1
3   6   G1: ”De Mysteriis Dom Sathanas”   G2: ”De Mysteris Dom Sathanas”   0,9
1   7   G1: ”Mayhem”                      G3: ”Mayhem”                     1
2   8   G1: ”Deathcrush”                  G3: ”Gentle murder”              0,2
3   9   G1: ”De Mysteriis Dom Sathanas”   G3: ”Pulling Puppet Strings”     0,1
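The slides do not say which string measure produced these similarity values. As a stand-in, the sketch below uses `difflib.SequenceMatcher` from the Python standard library; its scores will not be identical to the ones on the slide, but they rank the pairs in a comparable way (identical names score 1, a one-letter spelling variant scores high, unrelated titles score low).

```python
from difflib import SequenceMatcher

def literal_similarity(a, b):
    """Similarity of two literals in [0, 1], ignoring case (stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Literal pairs from the example graphs G1, G2 and G3.
pairs = [
    ("Mayhem", "Mayhem"),
    ("De Mysteriis Dom Sathanas", "De Mysteris Dom Sathanas"),
    ("Deathcrush", "Gentle murder"),
]
scores = {pair: literal_similarity(*pair) for pair in pairs}
```

Any other string distance (edit distance, token overlap) could be substituted for `literal_similarity` without changing the rest of the approach.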

[Figure: three example graphs]

G1 (black metal repository, collection A): node 1 Artist ”Mayhem”, node 2 Album ”Deathcrush”, node 3 Album ”De Mysteriis Dom Sathanas”; edge labels: dc:creator, foaf:name

G2 (MusicBrainz, collection B): node 4 Artist ”Mayhem”, node 5 Album ”Deathcrush”, node 6 Album ”De Mysteris Dom Sathanas”; edge labels: foaf:made, foaf:name

G3 (MusicBrainz, collection B): node 7 Artist ”Mayhem”, node 8 Album ”Gentle murder”, node 9 Album ”Pulling Puppet Strings”; edge labels: foaf:made, foaf:name

basic similarity measure for graphs: the mean of the literal similarities over the matched node pairs

graphs   matching                           similarity
G1, G2   MG1:G2a = (1, 4), (2, 5), (3, 6)   (1 + 1 + 0,9)/3 = 0,97
G1, G2   MG1:G2b = (1, 4), (2, 6), (3, 5)   (1 + 0,2 + 0,2)/3 = 0,47
G1, G3   MG1:G3a = (1, 7), (2, 8), (3, 9)   (1 + 0,2 + 0,1)/3 = 0,43
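The arithmetic above can be reproduced in a few lines. The pair similarities are the ones given on the slides (decimal commas written as points); scoring pairs that the slides do not mention as 0.0, and searching for the best matching by brute force over all permutations, are assumptions made for this sketch and only practical for very small graphs.

```python
from itertools import permutations

# Literal similarities per node pair, taken from the slides.
SIM = {(1, 4): 1.0, (2, 5): 1.0, (3, 6): 0.9,
       (2, 6): 0.2, (3, 5): 0.2,
       (1, 7): 1.0, (2, 8): 0.2, (3, 9): 0.1}

def matching_similarity(matching, sim=SIM):
    """Mean literal similarity over the node pairs of one matching.

    Pairs missing from `sim` count as 0.0 (an assumption of this sketch).
    """
    return sum(sim.get(pair, 0.0) for pair in matching) / len(matching)

def best_matching(source_nodes, target_nodes, sim=SIM):
    """Try every one-to-one assignment and keep the highest-scoring one."""
    candidates = (tuple(zip(source_nodes, perm))
                  for perm in permutations(target_nodes))
    return max(candidates, key=lambda m: matching_similarity(m, sim))

score_g2a = matching_similarity([(1, 4), (2, 5), (3, 6)])   # roughly 0.967
score_g2b = matching_similarity([(1, 4), (2, 6), (3, 5)])   # roughly 0.467
score_g3a = matching_similarity([(1, 7), (2, 8), (3, 9)])   # roughly 0.433
best = best_matching([1, 2, 3], [4, 5, 6])
```

With these numbers G2 is clearly the better counterpart of G1, so node 1 and node 4 would be the pair to link, while G3 is rejected even though its artist literal is identical.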

existing interoperability problems at different levels

linked data + graph matching provides:
- disambiguation both locally and externally
- tool for cleaning up local metadata
- automatic interlinking
- extended local data collection

thank you! on behalf of Kim Tallerås (kim.talleras@hioa.no), Nils Pharo, Jørn-Helge Dahl, David Massey