Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

LINKED OPEN DATA FOR CULTURAL HERITAGE, VLADIMIR ALEXIEV, 2016-09-29

Transcript of Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

Page 1: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

LINKED OPEN DATA FOR CULTURAL HERITAGE
VLADIMIR ALEXIEV

vladimir.alexiev@ontotext.com

2016-09-29

TABLE OF CONTENTS

1 Intro
  1.1 GLAM vs Internet
  1.2 Google NGrams: Phrases in Books
  1.3 Google NGrams: Two Specific Orgs
  1.4 Google Trends: Search Popularity
  1.5 How To Survive in the Internet Age
  1.6 Why Linked Open Data (LOD) is Important

2 GLAM Content Standards
  2.1 Museum Content Standards
  2.2 Archival Content Standards
  2.3 Library Content Standards

3 GLAM Metadata Schemas
  3.1 Seeing Standards (2)
  3.2 XML Schemas
  3.3 Museum Metadata: CDWA
  3.4 Archive Metadata
  3.5 Library Metadata: MARC

4 GLAM Ontologies
  4.1 Europeana Data Model
  4.2 CIDOC CRM
  4.3 Web Annotation (Open Annotation, OA)
  4.4 International Image Interop Framework (IIIF)
  4.5 Library Ontologies
  4.6 Archival Ontologies

5 GLAM LOD Datasets (LODLAM)
  5.1 Wikidata
  5.2 VIAF
  5.3 Global Authority Control

6 LODLAM Projects

1 INTRO
A bit about me: co-founder of Sirma Group Holding, Bulgaria's largest software group and parent company of Ontotext

30 years in IT: 8 at university, 22 in industry
Did plenty of project management, business analysis and data modeling, some big projects too
Last 8 years focused on data modeling and integration
Last 6 years in particular focused on semantic data and semantic integration

I love to poke in other people's data and get in-depth, so there's a lot about data in these slides
See My publications: you can sort by type and keyword, and full abstracts are available

I've provided a few references below, but if a topic interests you, please search in the publications

The shorter version has about 110 slides, so sit back, relax and enjoy the ride. It should take us 1:20h

Ask questions at any time in the chat; I'll answer them all at the end
This longer version has 130 slides, including info about Library metadata and ontologies

1.1 GLAM VS INTERNET
GLAM, CH, DH

Cultural Heritage (CH): the sum of our non-economic heritage
Obvious implications to economically significant sectors, eg tourism
Some say it's the source of all creativity: would you agree?
Includes old and new (eg digitally-born), material and immaterial, tangible and intangible, permanent and temporary (eg interactive installations)

Galleries, Libraries, Archives, Museums (GLAM): sisterhood of institutions that care for our CH, each with its own perspective and priorities
Digital Humanities (DH): the use of computers in the humanities

Eg some UK universities with DH programs: KingsDH, UCLDH, DH_OU, CamDigHum

1.2 GOOGLE NGRAMS: PHRASES IN BOOKS
Search for library, museum vs Google, Facebook, Twitter in books: the web sites are negligible

1.3 GOOGLE NGRAMS: TWO SPECIFIC ORGS
Compare two specific orgs: Facebook is more popular in recent books compared to British Museum over time

1.4 GOOGLE TRENDS: SEARCH POPULARITY
Web searches over the last 12 years: Facebook, Google are much more popular than library, museum

1.5 HOW TO SURVIVE IN THE INTERNET AGE
Since ancient times, GLAMs have been the centers of knowledge and wisdom

Aren't Google, Wikipedia, Facebook, Twitter and smart-phone apps becoming the new centers of research and culture (or at least popular culture)?
Will GLAMs fall victims to teenagers with smartphones browsing Facebook? If the library's attitude is "Come search in our OPAC", then certainly yes
How to preserve the role of GLAMs into the new millennium?

To survive, GLAMs must adopt the internet as their default modus operandi

Web 1.0: presentation
Web 2.0: interaction
Web 3.0 (semantic web): data linking, enriching, disambiguating text using NLP/IE approaches

1.6 WHY LINKED OPEN DATA (LOD) IS IMPORTANT
Culture is naturally cross-institutional, cross-border, multilingual and interlinked
LOD allows making connections between (and making sense of) the multitude of digitized cultural artifacts available on the net
LOD enables large-scale Digital Humanities research, collaboration and aggregation, technological renewal of CH institutions

2 GLAM CONTENT STANDARDS
GLAM data is complex and varied

Exception is the rule
Many metadata format variations
Data comes from a variety of systems

Thus professional organizations have found it useful to define content standards

Describe what data to capture (and sometimes how to go about it)
Before formalizing how to express it in machine-readable form

Examples are extremely useful for data modelers to decide how to map the data

2.1 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture, museums

2.1.1 CCO EXAMPLE: ARTWORK AND CREATOR RECORD

2.1.2 CCO EXAMPLE: HIERARCHICAL LINK BETWEEN 2 ARTWORKS

2.1.3 CCO EXAMPLE: CREATOR EXTENT

How to describe one aspect of the data

2.1.4 SPECTRUM

UK Museum Collections Management Standard

Defines procedures for museums to follow, and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation

2.1.5 SPECTRUM EXAMPLE: OBJECT ENTRY

2.2 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (eg Secretary of some society)
ISDIAH: archival holding institutions

Image by D. Pitti, 2015

2.3 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But they sometimes pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

2.3.1 FRBR, FRSAD, FRAD

Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Žumer, 2011)

2.3.2 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles)

Work: original or derived intellectual work (eg Don Quixote)
Expression: translation or edition (eg Don Quixote translation to English)
Manifestation: publisher's work (eg with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (eg Lincoln's Bible); manuscripts are singleton items

2.3.3 FRSAD

Anything can be subject (thema), referred to by various names/titles (nomen)

2.3.4 FRBR-LRM

FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards

3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library) apply to your work?

3.1 SEEING STANDARDS (2)

3.2 XML SCHEMAS
Do you deal with XML? I bet you do

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. Eg EAD3 is mastered in RNC, then RNG and XSD produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (eg cross-field validation)

Tools

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
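
To make "cross-field validation" concrete, here is a minimal sketch in plain Python. It is not Schematron and not the patched jing validator: it only imitates the idea of a rule that compares two fields and reports an XPath-like error location. The record layout (`record`, `dateRange`, `start`, `end`) is hypothetical and is not taken from EAD or any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element and attribute names are illustrative only.
RECORD = """
<record>
  <dateRange start="1921" end="1918"/>
  <dateRange start="1930" end="1945"/>
</record>
"""

def check_date_ranges(xml_text):
    """Return (location, message) pairs for ranges whose start is after their end."""
    root = ET.fromstring(xml_text)
    errors = []
    for i, rng in enumerate(root.findall("./dateRange"), start=1):
        start, end = int(rng.get("start")), int(rng.get("end"))
        if start > end:
            # Report an XPath-like location, in the spirit of SVRL output.
            errors.append((f"/record/dateRange[{i}]",
                           f"start {start} is after end {end}"))
    return errors

errors = check_date_ranges(RECORD)
for location, message in errors:
    print(location, message)
```

A grammar-based schema can say that `start` and `end` are both years, but the comparison between them needs a rule of this kind.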

3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)

3.3.1 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, eg for Dimension

3.3.2 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements

3.3.3 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this

3.3.4 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)

3.3.5 LIDO SCHEMA

Complex schema: eg when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA

3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context, Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events

EAG: So-called "controlled access points" are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable, and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged, but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
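
The bioghist problem can be illustrated with a short Python sketch that reads the EADiva sample with the standard library. The function name `extract_events` and the output dictionary keys are my own choices for illustration, not part of EAD or any tool mentioned here. The point to notice is that the only structured values recoverable are a normalized date and name strings; there are no identifiers to link on.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""

def extract_events(xml_text):
    """Pull date, tagged person names and event text out of each chronitem."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            # Names are plain strings: nothing says WHICH Samuel Wossname this is.
            "people": [p.get("normal") for p in event.iter("persname")],
            # Collapse whitespace in the mixed-content event text.
            "text": " ".join("".join(event.itertext()).split()),
        })
    return events

events = extract_events(BIOGHIST)
for e in events:
    print(e["date"], e["people"], e["text"])
```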

3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002

marc-must-die.info: wiki
FutureLib: Facebook group
MARC is dead (is it really?): in-depth discussion. Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?

RDF is a simple abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info
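
The "nesting bias" point can be sketched in a few lines of Python. Two hypothetical XML layouts of the same facts, one nesting the creator and one referencing it by ID, lift to the same set of triples. The element names, IDs, and bare property names (`title`, `creator`, `name`) are invented for illustration, and the triples are plain tuples rather than real RDF with IRIs.

```python
import xml.etree.ElementTree as ET

# Layout 1: the creator is nested inside the artwork.
NESTED = """<artwork id="a1"><title>Portrait</title>
  <creator id="p1"><name>Frans Hals</name></creator></artwork>"""

# Layout 2: the creator is a sibling element, referenced by ID.
BY_REF = """<data>
  <artwork id="a1" creator="p1"><title>Portrait</title></artwork>
  <person id="p1"><name>Frans Hals</name></person>
</data>"""

def lift_nested(xml_text):
    art = ET.fromstring(xml_text)
    person = art.find("creator")
    return {
        (art.get("id"), "title", art.findtext("title")),
        (art.get("id"), "creator", person.get("id")),
        (person.get("id"), "name", person.findtext("name")),
    }

def lift_by_ref(xml_text):
    root = ET.fromstring(xml_text)
    triples = set()
    for art in root.iter("artwork"):
        triples.add((art.get("id"), "title", art.findtext("title")))
        triples.add((art.get("id"), "creator", art.get("creator")))
    for person in root.iter("person"):
        triples.add((person.get("id"), "name", person.findtext("name")))
    return triples

# Both layouts yield the same graph: the nesting choice disappears after lifting.
same_graph = lift_nested(NESTED) == lift_by_ref(BY_REF)
print(same_graph)
```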

4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM inspired: events, some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

Criticized that it's not expressive enough. Eg can't capture the specific contribution of an artist to artwork
Complication: splits info about an object

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation

Old objects retro-converted from ESE are poor (only text), though some enrichments added by Europeana

Europeana Data Quality Committee formed to push this strategic point (2015-2020)

Evolving specification (since 2009)

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc

4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
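
The short-cut vs long-path construct can be sketched as a tiny inference over triples held as Python tuples. The three property names come from the slide; the `ex:` resources and the function `infer_shortcuts` are hypothetical, and a real system would express this as a rule in the triple store rather than in application code.

```python
P39I = "crm:P39i_was_measured_by"
P40 = "crm:P40_observed_dimension"
P43 = "crm:P43_has_dimension"

def infer_shortcuts(triples):
    """Derive thing P43 dimension wherever thing P39i measurement P40 dimension."""
    observed = {}
    for s, p, o in triples:
        if p == P40:
            observed.setdefault(s, set()).add(o)
    return {
        (thing, P43, dim)
        for thing, p, measurement in triples
        if p == P39I
        for dim in observed.get(measurement, ())
    }

# Hypothetical data: the long path keeps the measurement event, so details such
# as who measured and when can be attached to ex:measurement1.
graph = {
    ("ex:painting1", P39I, "ex:measurement1"),
    ("ex:measurement1", P40, "ex:height1"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:span1"),
}
shortcuts = infer_shortcuts(graph)
print(shortcuts)
```

The short-cut is convenient for search and display, while the long path is where the extra detail lives.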

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, eg Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml)
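
As a rough illustration of the "Image and region over it" case, the sketch below builds a JSON-LD annotation as a Python dictionary, following the W3C Web Annotation Data Model as I understand it (textual body, target with a media-fragment selector). The helper name `make_region_annotation` and all `example.org` URLs are placeholders; this is not the Complete Example from the spec.

```python
import json

def make_region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation that comments on a rectangular region of an image."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }

anno = make_region_annotation(
    "http://example.org/anno1",         # placeholder annotation IRI
    "http://example.org/image1.jpg",    # placeholder image IRI
    (10, 20, 300, 400),
    "Signature of the artist",
)
print(json.dumps(anno, indent=2))
```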

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
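
A sketch of what the "standard API" looks like in practice: the IIIF Image API addresses an image region through a URL of the form identifier/region/size/rotation/quality.format. The helper below only assembles such a URL; the server base and identifier are placeholders, and the default size `full` follows Image API 2.x (version 3 uses `max` instead).

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble an IIIF Image API request URL (2.x style defaults)."""
    # The identifier must be URL-encoded, including any slashes it contains.
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Placeholder server and identifier: a 512x512 pixel region scaled to 256 px wide.
url = iiif_image_url("https://example.org/iiif", "book1/page1",
                     region="0,0,512,512", size="256,")
print(url)
```

Because every compliant server understands the same URL pattern, a deep-zoom viewer can request tiles from any of them without server-specific code.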

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows to assemble manuscripts from pieces, present folios, etc. See eg Rob Sanderson presentations: IIIF and JSON-LD

4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (eg citation intent)
BibFrame: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA) Registry: info is well organized

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats

4.5.3 A TASTE OF FRBROO

EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, eg on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on Kiosk

4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good

Eg "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Progress report (2015), Mlist for comments
Conceptual Model 1.0 (Sep 2016): Document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)

5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). Eg Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2)

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this

5.1.6 HISTROPEDIA

Timelines of everything. Eg paintings by Leonardo

5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN, RKD artists, works, LoC Authorities, VIAF (not in M-n-M but on WD), BM persons, BBC YourPaintings, Artsy, etc

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, eg

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, and sometimes integrated into Virtual Research Environments

6.1 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists

6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, weaved using Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman)

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

6.4 CONSERVATIONSPACE
Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services

6.5.1 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL

6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector

6.6.1 TASTY BULGARIAN RECIPES

Eg 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A, Tolosi L, and Alexiev V. LNCS 9398, p182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V, DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
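
One simple way to jitter map markers is sketched below; the slides don't say which method the EFD application used, so this is an assumption. Each point is moved by a random offset drawn uniformly from a disc around the original coordinate. The radius, the rough "111 km per degree" conversion, and the approximate center of Bulgaria used in the example are illustrative values.

```python
import math
import random

KM_PER_DEGREE = 111.0  # rough length of one degree of latitude

def jitter(lat, lon, max_km=50.0, rng=random):
    """Return (lat, lon) moved by a random offset of at most max_km."""
    # sqrt() makes the points uniform over the disc instead of bunched at the center.
    r = max_km * math.sqrt(rng.random())
    theta = 2 * math.pi * rng.random()
    dlat = r * math.cos(theta) / KM_PER_DEGREE
    # Longitude degrees shrink with latitude, so scale by cos(lat).
    dlon = r * math.sin(theta) / (KM_PER_DEGREE * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

# Approximate center of Bulgaria (illustrative coordinates).
rng = random.Random(42)
flags = [jitter(42.7, 25.5, max_km=50.0, rng=rng) for _ in range(5)]
for lat, lon in flags:
    print(round(lat, 4), round(lon, 4))
```

Passing a seeded `random.Random` keeps the layout stable between page loads, which avoids flags jumping around when the map is redrawn.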

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition

6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V, Lindenthal J, and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (eg <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty self-inverse)
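
What inverse pairs and symmetric properties buy you can be shown with a small Python sketch that materializes the implied triples. The `ex:` property names are hypothetical stand-ins, not actual GVP associative relations, and a triple store with OWL reasoning would normally do this inference itself.

```python
def materialize_inverses(triples, inverse_of, symmetric):
    """Add the triples implied by owl:inverseOf pairs and symmetric properties."""
    inverse = dict(inverse_of)
    inverse.update({v: k for k, v in inverse_of.items()})  # pairs work both ways
    inverse.update({p: p for p in symmetric})              # self-inverse props
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result

# Hypothetical relations, standing in for a pair and a symmetric relation.
triples = {
    ("ex:a", "ex:teacherOf", "ex:b"),
    ("ex:b", "ex:siblingOf", "ex:c"),
}
result = materialize_inverses(
    triples,
    inverse_of={"ex:teacherOf": "ex:studentOf"},
    symmetric={"ex:siblingOf"},
)
for t in sorted(result):
    print(t)
```

With this in place, a relation entered from one side can be found from either side, which is the practical value of declaring the pairs.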

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V, Cobb J, Garcia G, Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD, Why Now (J. Cobb, 2014). Eg:

AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc

6.7.12 AAT IN EUROPEANA

PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
Europeana uses AAT to enrich type/subject/material fields

6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) Cast after

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata, to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
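
The idea of expanding one record into several Person records can be sketched as follows. The input fields mirror the sample record above; the output keys (`role`, `maidenName`, `likelyBornAfter`, `sourceRecord`) and the single inference shown (a child was probably born after the marriage date) are my own illustrative assumptions, not EHRI's actual data model or rules. Matching the stubs against the database is out of scope here.

```python
def expand_record(rec):
    """Create stub Person records for everyone mentioned in one source record."""
    def stub(name, role):
        # Every stub remembers which record it came from, for later matching.
        return {"name": name, "role": role, "sourceRecord": rec["id"]}

    main = stub(f"{rec['firstName']} {rec['lastName']}", "self")
    main["gender"] = rec.get("gender")
    persons = [main]

    if "nameSpouse" in rec:
        spouse = stub(rec["nameSpouse"], "spouse")
        if "nameSpouseMaiden" in rec:
            spouse["maidenName"] = rec["nameSpouseMaiden"]
        persons.append(spouse)
    if "nameChild" in rec:
        child = stub(rec["nameChild"], "child")
        if "dateMarriage" in rec:
            # Likely, not certain: a heuristic to narrow down later matching.
            child["likelyBornAfter"] = rec["dateMarriage"]
        persons.append(child)
    if "nameSibling" in rec:
        persons.append(stub(rec["nameSibling"], "sibling"))
    return persons

rec = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
persons = expand_record(rec)
for p in persons:
    print(p)
```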

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

semantic differencing (interesting): KGB - Stalin + Hitler = SS
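
The two operations behind the table and the "KGB - Stalin + Hitler" example can be sketched with hand-made vectors. The four-dimensional toy vectors below are invented so that the arithmetic is easy to follow; they are not output of the INRIA word2vec model. The score computed is cosine similarity (higher means closer), which appears to be what the "Cos dist" column reports.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(vectors, a, b, c):
    """Return the word closest to a - b + c, excluding the three input words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Toy dimensions: (soviet, german, organization, person). Invented values.
TOY = {
    "KGB":    [1.0, 0.0, 1.0, 0.0],
    "Stalin": [1.0, 0.0, 0.0, 1.0],
    "Hitler": [0.0, 1.0, 0.0, 1.0],
    "SS":     [0.0, 1.0, 1.0, 0.0],
    "guard":  [0.2, 0.2, 0.5, 0.5],
}

answer = analogy(TOY, "KGB", "Stalin", "Hitler")
print(answer)
```

Subtracting "Stalin" removes the soviet and person components, and adding "Hitler" puts back a german and a person component, which leaves a vector pointing at "german organization".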

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper

Art History networks from Wikipedia through VIAF idTime and nationality from ULAN

612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius, ... rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment   Cos dist
guarding   0.593507    punishments  0.668144
sentry     0.512083    punish       0.601212
hlinka     0.496201    punishing    0.543213
gate       0.490032    beatings     0.527033
watching   0.484647    penalty      0.497262
rifle      0.484379    deserved     0.490157
lookout    0.482025    beaten       0.473870
patrol     0.477233    straf        0.473338
soldier    0.475982    offense      0.461230
guarded    0.474689    executing    0.459965
police     0.474291    merciless    0.455123

Semantic differencing (interesting): KGB ‐ Stalin + Hitler = SS
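The two operations behind this slide, nearest neighbours by cosine measure and "semantic differencing" by vector arithmetic, can be sketched in plain Python. The 2-d vectors below are invented for illustration (real word2vec vectors have hundreds of dimensions and come from a trained model); the function names are hypothetical. Note that the "Cos dist" values in the table above appear to be similarities (higher means closer), which is what `cosine` computes here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to a - b + c, excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors, invented for illustration: axis 0 = "which regime", axis 1 = "is an organization"
toy = {
    "Stalin": [1.0, 0.0],
    "Hitler": [-1.0, 0.0],
    "KGB": [1.0, 1.0],
    "SS": [-1.0, 1.0],
    "guard": [0.0, 0.5],
}
result = analogy(toy, "KGB", "Stalin", "Hitler")
```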

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHERS' PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.

6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

Page 3: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

1 INTRO
A bit about me: co-founder of Sirma Group Holding, Bulgaria's largest software group and parent company of Ontotext.

30y in IT: 8 at university, 22 in industry
Did plenty of project management, business analysis and data modeling; some big projects too
Last 8 years focused on data modeling and integration
Last 6 years in particular focused on semantic data and semantic integration

I love to poke in other people's data and get in-depth. So there's a lot about data in these slides.
See My publications: you can sort by type and keyword, full abstracts are available.

I've provided a few references below, but if a topic interests you, please search in the publications.

The shorter version has about 110 slides, so sit back, relax and enjoy the ride. Should take us 1:20h.

Ask questions at any time in the chat, I'll answer them all at the end.
This longer version has 130 slides, including info about Library metadata and ontologies.

1.1 GLAM VS INTERNET
GLAM, CH, DH:

Cultural Heritage (CH): the sum of our non-economic heritage
  Obvious implications to economically significant sectors, e.g. tourism
  Some say it's the source of all creativity: would you agree?
  Includes old and new (e.g. digitally-born), material and immaterial, tangible and intangible, permanent and temporal (e.g. interactive installations)

Galleries, Libraries, Archives, Museums (GLAM): sisterhood of institutions that care for our CH, each with its own perspective and priorities
Digital Humanities (DH): the use of computers in the humanities

E.g. some UK universities with DH programs: KingsDH, UCLDH, DH_OU, CamDigHum

1.2 GOOGLE NGRAMS: PHRASES IN BOOKS
Search for "library", "museum" vs "Google", "Facebook", "Twitter" in books: the web sites are negligible.

1.3 GOOGLE NGRAMS: TWO SPECIFIC ORGS
Compare two specific orgs: Facebook is more popular in recent books, compared to British Museum over time.

1.4 GOOGLE TRENDS: SEARCH POPULARITY
Web searches over the last 12 years: Facebook, Google are much more popular than library, museum.

1.5 HOW TO SURVIVE IN THE INTERNET AGE
Since ancient times, GLAMs have been the centers of knowledge and wisdom.

Aren't Google, Wikipedia, Facebook, Twitter and smart-phone apps becoming the new centers of research and culture (or at least popular culture)?
Will GLAMs fall victims to teenagers with smartphones browsing Facebook? If the library's attitude is "Come search in our OPAC", then certainly yes.
How to preserve the role of GLAMs into the new millennium?

To survive, GLAMs must adopt the internet as their default modus operandi:

Web 1.0: presentation
Web 2.0: interaction
Web 3.0 (semantic web): data linking, enriching/disambiguating text using NLP/IE approaches

1.6 WHY LINKED OPEN DATA (LOD) IS IMPORTANT
Culture is naturally cross-institutional, cross-border, multilingual and interlinked
LOD allows making connections between (and making sense of) the multitude of digitized cultural artifacts available on the net
LOD enables large-scale Digital Humanities research, collaboration and aggregation, technological renewal of CH institutions

2 GLAM CONTENT STANDARDS
GLAM data is complex and varied:

Exception is the rule
Many metadata format variations
Data comes from a variety of systems

Thus professional organizations have found it useful to define content standards:

Describe what data to capture (and sometimes how to go about it)
Before formalizing how to express it in machine-readable form

Examples are extremely useful for data modelers to decide how to map the data.

2.1 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture, museums

2.1.1 CCO EXAMPLE: ARTWORK AND CREATOR RECORD

2.1.2 CCO EXAMPLE: HIERARCHICAL LINK BETWEEN 2 ARTWORKS

2.1.3 CCO EXAMPLE: CREATOR EXTENT

How to describe one aspect of the data

2.1.4 SPECTRUM

UK Museum Collections Management Standard

Defines procedures for museums to follow, and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation

2.1.5 SPECTRUM EXAMPLE: OBJECT ENTRY

2.2 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions

Image by D.Pitti, 2015

2.3 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But sometimes pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

2.3.1 FRBR, FRSAD, FRAD

Functional Requirements for: Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD). (J.Mitchell, M.Zeng, M.Zumer, 2011)

2.3.2 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):

Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
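The four WEMI levels nest naturally, which a small Python sketch can make concrete. The class and field names below are assumptions chosen for illustration (FRBR itself defines many more attributes and relationships), and the ISBN and barcodes are placeholders, not real identifiers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    holder: str    # institution holding this physical copy
    barcode: str   # placeholder copy identifier


@dataclass
class Manifestation:
    publisher: str
    isbn: str      # ISBNs live at the Manifestation level
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def all_items(self):
        """Every physical copy of every edition of every expression of this work."""
        return [i for e in self.expressions for m in e.manifestations for i in m.items]


# Hypothetical example data
quixote = Work("Don Quixote", [
    Expression("en", [
        Manifestation("Example Press", "ISBN-PLACEHOLDER", [
            Item("Library A", "copy-1"),
            Item("Library B", "copy-2"),
        ]),
    ]),
    Expression("es", []),
])
copies = quixote.all_items()
```

A loan system only needs the Item level, while a "find all editions" search walks down from the Work, which is the practical reason for keeping the levels separate.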

2.3.3 FRSAD

Anything can be subject (thema), referred to by various names/titles (nomen)

2.3.4 FRBR-LRM

FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards.

3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

3.1 SEEING STANDARDS (2)

3.2 XML SCHEMAS
Do you deal with XML? I bet you do.

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).

3.3.1 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.

3.3.2 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.

3.3.3 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

3.3.4 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)

3.3.5 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA.

3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events

EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
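To show what can (and cannot) be recovered from such a bioghist, here is a minimal Python sketch using only the standard library. The `chron_events` helper is a hypothetical name; the snippet is a copy of the example above without an XML namespace, whereas real EAD files usually declare one, so this is an assumption that would need adjusting.

```python
import xml.etree.ElementTree as ET

EAD_SNIPPET = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def chron_events(xml_text):
    """Return (normalized date, tagged person names, event text) per chronitem."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("chronitem"):
        date = item.find("date").get("normal")
        event = item.find("event")
        persons = [p.get("normal") for p in event.iter("persname")]
        text = " ".join("".join(event.itertext()).split())
        rows.append((date, persons, text))
    return rows


events = chron_events(EAD_SNIPPET)
```

The date and the tagged names come out as structured values, but the relation between the two persons ("succeeds ... as department head") stays buried in free text, which is exactly the problem listed above.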

3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable and doesn't accommodate new FRBR principles. MARC-XML is not much better.

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002:

marc-must-die.info: MARC is dead (is it really?)
FutureLib: in-depth discussion wiki
Facebook group
Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?

RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info

4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM inspired: events, some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:

  EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation

Old objects retro-converted from ESE are poor (only text), though some enrichments added by Europeana

Europeana Data Quality Committee formed to push this strategic point (2015-2020)

Evolving specification (since 2009):

  Currently considering actual implementation of Events
  Extensions for manuscripts, music, fashion, etc

4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
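The short-cut vs long-path construct can be illustrated by rewriting triples. The Python sketch below represents triples as plain tuples of prefixed names; `expand_dimension_shortcut`, the `ex:` resources and the `_:measurementN` node labels are hypothetical. The intermediate node is typed crm:E16_Measurement, the CRM class that both P39 and P40 attach to.

```python
def expand_dimension_shortcut(triples):
    """Replace each crm:P43_has_dimension triple by the equivalent long path
    through an explicit measurement node."""
    out = []
    n = 0
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            n += 1
            m = f"_:measurement{n}"  # hypothetical blank-node label
            out.append((s, "crm:P39i_was_measured_by", m))
            out.append((m, "rdf:type", "crm:E16_Measurement"))
            out.append((m, "crm:P40_observed_dimension", o))
        else:
            out.append((s, p, o))
    return out


short = [
    ("ex:painting1", "rdf:type", "crm:E24_Physical_Man-Made_Thing"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
]
long_path = expand_dimension_shortcut(short)
```

The extra measurement node is where "more details" go: who measured, when, and with what method can all be attached to it, which the short-cut cannot express.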

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
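For the "Image and region over it" case, an annotation can be written down as JSON-LD. The Python sketch below builds one as a dictionary following the W3C Web Annotation Data Model (body, target, FragmentSelector with a media-fragment `xywh` value); the helper name, the example.org URLs and the comment text are assumptions for illustration only.

```python
import json


def region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation that attaches a text comment to a rectangular image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


anno = region_annotation(
    "http://example.org/anno/1",          # hypothetical annotation IRI
    "http://example.org/images/page1.jpg",  # hypothetical image
    (10, 20, 300, 150),
    "Signature of the artist",
)
serialized = json.dumps(anno, indent=2)
```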

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
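The heart of the IIIF Image API is a URL pattern, {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}, so any client can request exactly the crop and resolution it needs. A minimal Python sketch follows; the `iiif_image_url` helper and the example.org server are hypothetical, and the defaults follow Image API version 2 (where the full-size keyword is "full").

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose an IIIF Image API request URL (version 2 parameter conventions)."""
    ident = quote(identifier, safe="")  # identifiers must be URL-encoded
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


whole = iiif_image_url("https://example.org/iiif/", "page 1")
# A detail: the region as percentages of the image, scaled to 600 pixels wide
detail = iiif_image_url("https://example.org/iiif", "page 1",
                        region="pct:10,10,50,50", size="600,")
```

A DeepZoom viewer issues many such requests for small tiles at successive sizes, which is how very large images stay responsive.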

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson presentations: IIIF and JSON-LD.

4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats.

4.5.3 A TASTE OF FRBROO

EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf / rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4.5.6.1 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Progress report (2015), Mlist for comments
Conceptual Model 1.0 (Sep 2016): Document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected:
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)

5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments

6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists

6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules woven using the Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work) Watch the video (D.Oldman)

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

6.4 CONSERVATIONSPACE
Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but is easy with SPARQL

6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi, x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
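Jittering just means adding a small random offset to each marker so that objects sharing one coordinate spread out. The Python sketch below is a minimal illustration and not the EFD implementation: the `jitter` helper, the offset of half a degree and the approximate centre coordinates are all assumptions.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset in each direction."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


rng = random.Random(42)        # seeded so the layout is reproducible
center = (42.75, 25.5)         # approximate centre of Bulgaria, for illustration only
points = [jitter(*center, rng=rng) for _ in range(5)]
```

The offset should be chosen relative to the size of the place (a country tolerates a larger spread than a town), otherwise markers may drift across a border.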

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:

  Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
  Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
  Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition

6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust
AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk/support on twitter and google group (see home page)
Presentations, papers:

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014

External Ontologies:

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
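What owl:inverseOf and owl:SymmetricProperty declarations buy can be shown with a few lines of Python: asserting a relation in one direction lets the other direction be inferred. This is a sketch over plain tuples, not how GraphDB implements inference; the `add_inverses` helper and the `ex:` property and resource names are hypothetical, not actual GVP relation names.

```python
def add_inverses(triples, inverse_of, symmetric):
    """Return the triples plus those implied by inverse and symmetric declarations."""
    pairs = dict(inverse_of)
    pairs.update({v: k for k, v in inverse_of.items()})  # owl:inverseOf works both ways
    pairs.update({p: p for p in symmetric})               # symmetric = self-inverse
    inferred = set(triples)
    for s, p, o in triples:
        if p in pairs:
            inferred.add((o, pairs[p], s))
    return inferred


# Hypothetical declarations and data
inverse_of = {"ex:teacherOf": "ex:studentOf"}
symmetric = {"ex:siblingOf"}
asserted = [
    ("ex:artistA", "ex:teacherOf", "ex:artistB"),
    ("ex:artistB", "ex:siblingOf", "ex:artistC"),
]
closed = add_inverses(asserted, inverse_of, symmetric)
```

A query for the students of ex:artistA or for the teachers of ex:artistB then finds the same fact, regardless of which direction was catalogued.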

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). E.g.

AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
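
How such "semantic differencing" works can be shown with a toy Python sketch. The 4-dimensional vectors below are invented by hand for illustration (real word2vec vectors have hundreds of dimensions and are learned from text); they are not taken from the INRIA experiments.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors, made up for this example.
# Dimensions: (organization, Soviet, leader, German)
vectors = {
    "KGB":    [1.0, 1.0, 0.0, 0.0],
    "Stalin": [0.0, 1.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0, 1.0],
    "SS":     [1.0, 0.0, 0.0, 1.0],
    "guard":  [0.6, 0.1, 0.0, 0.3],
}


def analogy(a, b, c):
    """Return the word closest to a - b + c, excluding the three inputs."""
    target = [x - y + z
              for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


print(analogy("KGB", "Stalin", "Hitler"))
```

The nearest-neighbour lists in the table are produced the same way: rank all words by cosine similarity to the query word's vector.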

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames so we can get coordinates

6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN

6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War

6.13.4 NOMISMA

Shared authorities for numismatics. Eg a mint

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki


1.1 GLAM VS INTERNET
GLAM, CH, DH:

Cultural Heritage (CH): the sum of our non-economic heritage
Obvious implications to economically significant sectors, eg tourism
Some say it's the source of all creativity: would you agree?
Includes old and new (eg digitally-born), material and immaterial, tangible and intangible, permanent and temporal (eg interactive installations)

Galleries, Libraries, Archives, Museums (GLAM): sisterhood of institutions that care for our CH, each with its own perspective and priorities
Digital Humanities (DH): the use of computers in the humanities

Eg some UK universities with DH programs: KingsDH, UCLDH, DH_OU, CamDigHum

1.2 GOOGLE NGRAMS: PHRASES IN BOOKS
Search for library, museum vs Google, Facebook, Twitter in books: the web sites are negligible

1.3 GOOGLE NGRAMS: TWO SPECIFIC ORGS
Compare two specific orgs: Facebook is more popular in recent books, compared to British Museum over time

1.4 GOOGLE TRENDS: SEARCH POPULARITY
Web searches over the last 12 years: Facebook, Google are much more popular than library, museum

1.5 HOW TO SURVIVE IN THE INTERNET AGE
Since ancient times GLAMs have been the centers of knowledge and wisdom.

Aren't Google, Wikipedia, Facebook, Twitter and smart-phone apps becoming the new centers of research and culture (or at least popular culture)?
Will GLAMs fall victims to teenagers with smartphones browsing Facebook? If the library's attitude is "Come search in our OPAC", then certainly yes
How to preserve the role of GLAMs into the new millennium?

To survive, GLAMs must adopt the internet as their default modus operandi:

Web 1.0: presentation
Web 2.0: interaction
Web 3.0 (semantic web): data linking, enriching, disambiguating text using NLP/IE approaches

1.6 WHY LINKED OPEN DATA (LOD) IS IMPORTANT
Culture is naturally cross-institutional, cross-border, multilingual, and interlinked
LOD allows making connections between (and making sense of) the multitude of digitized cultural artifacts available on the net
LOD enables large-scale Digital Humanities research, collaboration and aggregation, technological renewal of CH institutions

2 GLAM CONTENT STANDARDS
GLAM data is complex and varied:

Exception is the rule
Many metadata format variations
Data comes from a variety of systems

Thus professional organizations have found it useful to define content standards:

Describe what data to capture (and sometimes how to go about it)
Before formalizing how to express it in machine-readable form

Examples are extremely useful for data modelers to decide how to map the data

2.1 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture, museums

2.1.1 CCO EXAMPLE: ARTWORK AND CREATOR RECORD

2.1.2 CCO EXAMPLE: HIERARCHICAL LINK BETWEEN 2 ARTWORKS

2.1.3 CCO EXAMPLE: CREATOR EXTENT

How to describe one aspect of the data

2.1.4 SPECTRUM

UK Museum Collections Management Standard

Defines procedures for museums to follow, and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation

2.1.5 SPECTRUM EXAMPLE: OBJECT ENTRY

2.2 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (eg Secretary of some society)
ISDIAH: archival holding institutions

Image by D.Pitti, 2015

2.3 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But sometimes pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

2.3.1 FRBR, FRSAD, FRAD

Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J.Mitchell, M.Zeng, M.Zumer, 2011)

2.3.2 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):

Work: original or derived intellectual work (eg Don Quixote)
Expression: translation or edition (eg Don Quixote translation to English)
Manifestation: publisher's work (eg with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (eg Lincoln's Bible); manuscripts are singleton items
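
The four WEMI levels nest naturally, which a small Python sketch can make concrete. The class and field names, the publisher and the ISBN placeholder are assumptions made for this example; FRBR itself defines many more attributes and relationships.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    holder: str              # e.g. the library that owns this physical copy
    available: bool = True   # loan/availability is tracked at Item level


@dataclass
class Manifestation:
    publisher: str
    isbn: str                # ISBNs identify Manifestations
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    language: str            # a translation or edition of the Work
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def available_items(self):
        """All copies of any edition of this Work that can be borrowed."""
        return [item
                for expression in self.expressions
                for manifestation in expression.manifestations
                for item in manifestation.items
                if item.available]


# Hypothetical data: one English edition held by two libraries.
edition = Manifestation(
    publisher="Example Press", isbn="(placeholder)",
    items=[Item("Library A"), Item("Library B", available=False)])
quixote = Work("Don Quixote", [Expression("English", [edition])])

print([item.holder for item in quixote.available_items()])
```

A user task such as "obtain" then becomes a walk from Work down to Item, regardless of which Expression or Manifestation the copy belongs to.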

2.3.3 FRSAD

Anything can be subject (thema), referred to by various names/titles (nomen)

2.3.4 FRBR-LRM

FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards

3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

3.1 SEEING STANDARDS (2)

3.2 XML SCHEMAS
Do you deal with XML? I bet you do

XML Schema (XSD): most widely used, but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. Eg EAD3 is mastered in RNC, then RNG and XSD produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (eg cross-field validation)

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)

3.3.1 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements
Display vs Indexing (structured) elements, eg for Dimension

3.3.2 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements

3.3.3 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this

3.3.4 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA, museumdat, with inspiration from CIDOC CRM (Images by R.Stein and A.Vitzthum, ATHENA workshop, 2010)

3.3.5 LIDO SCHEMA

Complex schema: eg when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements inherited from CDWA

3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events

EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged, but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
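
When person names are tagged like this, a first step from "strings" to "things" is easy to script. The sketch below uses only Python's standard library to pull dates and normalized person names out of the example; it assumes un-namespaced EAD as shown, and the function name and output shape are made up for illustration.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def chron_events(xml_text):
    """Return one dict per chronitem: date, plain text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            # Flatten mixed content and collapse whitespace.
            "text": " ".join("".join(event.itertext()).split()),
            # Normalized names are candidates for matching to authorities.
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


for entry in chron_events(BIOGHIST):
    print(entry)
```

The second event has no tagged persons, which illustrates the problem noted above: untagged names stay buried in free text.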

3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles
MARC-XML is not much better

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002

marc-must-die.info: MARC is dead (is it really?)
FutureLib: in-depth discussion wiki
Facebook group
Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?

RDF is a simple abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies
(RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info

4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM: inspired events, some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

Criticized that it's not expressive enough. Eg can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation

Old objects retro-converted from ESE are poor (only text), though some enrichments added by Europeana

Europeana Data Quality Committee formed to push this strategic point (2015-2020)

Evolving specification (since 2009)

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc

4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
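
The short-cut vs long-path construct can be illustrated with plain triples in Python. This is a minimal sketch under stated assumptions: the ex: resources and the time-span value are invented, triples are modelled as simple tuples rather than with an RDF library, and the inference function is only one possible way to derive the short-cut.

```python
P43 = "crm:P43_has_dimension"
P39I = "crm:P39i_was_measured_by"
P40 = "crm:P40_observed_dimension"

# Long path: the Measurement event can carry extra detail (type, time-span).
triples = [
    ("ex:painting1", P39I, "ex:measurement1"),
    ("ex:measurement1", "rdf:type", "crm:E16_Measurement"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:span2016"),
    ("ex:measurement1", P40, "ex:height1"),
    ("ex:height1", "crm:P90_has_value", "72.5"),
]


def infer_shortcuts(triples):
    """For object -P39i-> measurement -P40-> dimension,
    derive the short-cut object -P43-> dimension."""
    observed = {}
    for subject, prop, obj in triples:
        if prop == P40:
            observed.setdefault(subject, []).append(obj)
    return {(thing, P43, dimension)
            for thing, prop, measurement in triples if prop == P39I
            for dimension in observed.get(measurement, [])}


print(infer_shortcuts(triples))
```

The short-cut is convenient for search, while the long path keeps who measured what and when; a dataset can store the long path and derive the short-cut.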

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, eg Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml)

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See eg Rob Sanderson presentations: IIIF and JSON-LD

4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (eg citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA): Registry info is well organized

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats

4.5.3 A TASTE OF FRBROO

EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, eg on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no since 2014) uses Koha open source software, RDF in the core and marc2rdf, rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on Kiosk

4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_restnr_749919
  rdf:type bibo:Document, fabio:Manifestation
  dc:title "About time"
  d:titleURLized "about_time"
  fabio:hasSubtitle "Einstein's unfinished revolution"
  ctag:tagged d_keywordimaginary, d_keyworddilation, d_keywordtime, d_keywordtidsreiser, d_keywordtidsdilatasjon
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>
  dc:language lexvoeng
  d:bibliofilID "931138"
  dc:format <http://data.deichman.no/format/Book>
  d:location_signature "Dav"
  dc:publisher d_orgpenguin
  bibo:numPages "316"
  d:physicalDescription "fig"
  d:bibsubject d_subjecteinstein_albert, d_subjecttid_metafysikk
  fabio:isManifestationOf d_workx24918900_about_time
  d:signatureNote "07x0619gq"
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>
  d:bsID "0181541"
  dc:description "Bibliografi: s. 293-294"@no
  d:priceInfo "Nkr 17000"
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>
  dc:identifier "749919"
  d:dewey "115", "530.11"
  d:location_dewey "530.11"
  bibo:isbn "9780140174618", "0140174613"

4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Conceptual Model: Progress report (2015), 10 (Sep 2016), Mlist for comments. Document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected:
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)

5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). Eg Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2)

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this

5.1.6 HISTROPEDIA

Timelines of everything. Eg paintings by Leonardo

5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment: study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the "library-tradition" datasets (dominated by VIAF) and the "LOD-tradition" datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc etc

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, eg:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)

Research functions are sometimes integrated into Virtual Research Environments

6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists

6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, weaved using Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D.Oldman)

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

6.4 CONSERVATIONSPACE
Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services

6.5.1 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL

6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector

6.6.1 TASTY BULGARIAN RECIPES

Eg 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi, x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
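
Jittering itself is simple; a minimal Python sketch follows. It is not the EFD implementation: the function name, the 0.5 degree radius and the approximate centre coordinates of Bulgaria are assumptions chosen for the example.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Move a point by a random offset of at most radius_deg on each axis."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))


# Approximate centre of Bulgaria (assumed for illustration).
BULGARIA_CENTER = (42.7, 25.5)

# A seeded generator makes the scattered flags reproducible between runs.
rng = random.Random(42)
flags = [jitter(*BULGARIA_CENTER, rng=rng) for _ in range(9000)]

print(flags[:3])
```

A small radius keeps flags visually attached to the country while letting the clustering library separate them when the user zooms in.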

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter and a Polish tradition

6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014
External Ontologies:

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basis vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (eg <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty self-inverse)
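
What the inverse and symmetric declarations buy you can be shown with a short Python sketch. The ex: property and resource names are hypothetical (they are not actual GVP relations), and a real triple store would derive these statements through OWL inference rather than a hand-written loop.

```python
# Hypothetical declarations, in the spirit of owl:inverseOf
# and owl:SymmetricProperty.
INVERSE_OF = {"ex:influenced": "ex:influencedBy"}
SYMMETRIC = {"ex:relatedTo"}


def materialize_inverses(triples):
    """Add the reverse statement for every inverse or symmetric relation."""
    inverse = dict(INVERSE_OF)
    # owl:inverseOf works in both directions.
    inverse.update({v: k for k, v in INVERSE_OF.items()})
    result = set(triples)
    for subject, prop, obj in triples:
        if prop in inverse:
            result.add((obj, inverse[prop], subject))
        elif prop in SYMMETRIC:
            result.add((obj, prop, subject))
    return result


asserted = {
    ("ex:conceptA", "ex:influenced", "ex:conceptB"),
    ("ex:conceptC", "ex:relatedTo", "ex:conceptD"),
}

for triple in sorted(materialize_inverses(asserted)):
    print(triple)
```

Only one direction of each relation needs to be recorded in the source data; the other direction follows from the ontology.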

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). Eg:

AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH interviews:

ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard     Cos dist   punishment   Cos dist
guarding  0.593507   punishments  0.668144
sentry    0.512083   punish       0.601212
hlinka    0.496201   punishing    0.543213
gate      0.490032   beatings     0.527033
watching  0.484647   penalty      0.497262
rifle     0.484379   deserved     0.490157
lookout   0.482025   beaten       0.473870
patrol    0.477233   straf        0.473338
soldier   0.475982   offense      0.461230
guarded   0.474689   executing    0.459965
police    0.474291   merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
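How such a "semantic difference" is computed can be shown with a toy example. The sketch below uses made-up 3-dimensional vectors, not the INRIA word2vec model; the `analogy` helper and the meaning given to the dimensions are assumptions for illustration. Real embeddings have hundreds of dimensions, but the arithmetic is the same: subtract, add, then pick the nearest word by cosine similarity.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))


# Made-up vectors, for illustration only.
# Hypothetical dimensions: [security organization, Soviet, Nazi German]
toy = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS":     [1.0, 0.0, 1.0],
    "guard":  [0.6, 0.1, 0.1],
}
print(analogy(toy, "KGB", "Stalin", "Hitler"))  # SS
```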

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHERS PROJECTS WIKIARTHISTORY
Vienna University of Technology (site, paper)

Art history networks from Wikipedia, through VIAF id
Time and nationality from ULAN

612 CHARTEX
NLP analysis of medieval charters and deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki


12 GOOGLE NGRAMS PHRASES IN BOOKS
Search for library, museum vs Google, Facebook, Twitter in books: the web sites are negligible

13 GOOGLE NGRAMS TWO SPECIFIC ORGS
Compare two specific orgs over time: Facebook is more popular than the British Museum in recent books

14 GOOGLE TRENDS SEARCH POPULARITY
Web searches over the last 12 years: Facebook and Google are much more popular than library and museum

15 HOW TO SURVIVE IN THE INTERNET AGE
Since ancient times, GLAMs have been the centers of knowledge and wisdom.

Aren't Google, Wikipedia, Facebook, Twitter and smart-phone apps becoming the new centers of research and culture (or at least popular culture)?
Will GLAMs fall victim to teenagers with smartphones browsing Facebook? If the library's attitude is "Come search in our OPAC", then certainly yes
How to preserve the role of GLAMs into the new millennium?

To survive, GLAMs must adopt the internet as their default modus operandi:

Web 1.0: presentation
Web 2.0: interaction
Web 3.0 (semantic web): data linking, enriching, disambiguating text using NLP/IE approaches

16 WHY LINKED OPEN DATA (LOD) IS IMPORTANT
Culture is naturally cross-institutional, cross-border, multilingual and interlinked
LOD allows making connections between (and making sense of) the multitude of digitized cultural artifacts available on the net
LOD enables large-scale Digital Humanities research, collaboration and aggregation, technological renewal of CH institutions

2 GLAM CONTENT STANDARDS
GLAM data is complex and varied:

Exception is the rule
Many metadata format variations
Data comes from a variety of systems

Thus professional organizations have found it useful to define content standards:

Describe what data to capture (and sometimes how to go about it)
Before formalizing how to express it in machine-readable form

Examples are extremely useful for data modelers to decide how to map the data

21 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture, museums

211 CCO EXAMPLE ARTWORK AND CREATOR RECORD

212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS

213 CCO EXAMPLE CREATOR EXTENT

How to describe one aspect of the data

214 SPECTRUM

UK Museum Collections Management Standard

Defines procedures for museums to follow, and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation

215 SPECTRUM EXAMPLE OBJECT ENTRY

22 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions

Image by D. Pitti, 2015

23 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But sometimes they pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

231 FRBR FRSAD FRAD

Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Žumer, 2011)

232 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):

Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
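The four WEMI levels form a simple chain, which can be sketched as Python dataclasses. This is just an illustration of the levels, not any FRBR ontology: the class fields, the publisher name, the ISBN placeholder and the shelfmarks are all made up.

```python
from dataclasses import dataclass


@dataclass
class Work:
    title: str


@dataclass
class Expression:
    work: Work
    language: str  # e.g. a translation


@dataclass
class Manifestation:
    expression: Expression
    publisher: str
    isbn: str  # ISBNs identify manifestations, not works


@dataclass
class Item:
    manifestation: Manifestation
    shelfmark: str
    on_loan: bool = False  # loan/availability is tracked per physical copy


def work_of(item: Item) -> Work:
    """Walk up the chain Item -> Manifestation -> Expression -> Work."""
    return item.manifestation.expression.work


don_quixote = Work("Don Quixote")
english = Expression(don_quixote, language="en")
edition = Manifestation(english, publisher="Example Press", isbn="0000000000")  # placeholder values
copy_1 = Item(edition, shelfmark="EX-1")
copy_2 = Item(edition, shelfmark="EX-2", on_loan=True)

print(work_of(copy_1).title, "| copies on loan:", sum(c.on_loan for c in (copy_1, copy_2)))
```

Two physical copies share one Manifestation, so the ISBN is stored once and the loan status separately per Item.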

233 FRSAD

Anything can be a subject (thema), referred to by various names/titles (nomen)

234 FRBR-LRM

FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards

3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (By Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

31 SEEING STANDARDS (2)

32 XML SCHEMAS
Do you deal with XML? I bet you do.

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension

332 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R. Stein and A. Vitzthum, ATHENA workshop, 2010)

335 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements: inherited from CDWA

34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events

EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>

35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002

MARC is dead (is it really?): in-depth discussion, wiki
marc-must-die.info, FutureLib, Facebook group
Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?

RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
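The point about nesting biases can be illustrated with a small sketch: two XML layouts of the same facts (one nests the creator, the other references it by ID) "lift" to the same set of subject-predicate-object triples. The element names, IDs and the `lift` helper are invented for this example, and plain tuples stand in for real RDF.

```python
import xml.etree.ElementTree as ET

# The same facts, once with the creator nested, once referenced by ID.
NESTED = """<object id="o1"><title>Portrait</title>
  <creator id="p1"><name>Frans Hals</name></creator></object>"""

BY_REF = """<data>
  <object id="o1"><title>Portrait</title><creator ref="p1"/></object>
  <person id="p1"><name>Frans Hals</name></person>
</data>"""


def lift(xml_text):
    """Lift either layout to a set of (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    triples = set()
    for obj in root.iter("object"):
        subject = obj.get("id")
        triples.add((subject, "title", obj.findtext("title")))
        creator = obj.find("creator")
        triples.add((subject, "creator", creator.get("ref") or creator.get("id")))
    # Person details may sit on a nested <creator> or on a separate <person>.
    for el in list(root.iter("creator")) + list(root.iter("person")):
        if el.get("id"):
            triples.add((el.get("id"), "name", el.findtext("name")))
    return triples


print(sorted(lift(NESTED)))
print(lift(NESTED) == lift(BY_REF))  # True
```

Once lifted, the choice between nesting and referencing no longer matters: both documents describe the same graph.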

41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC CRM: inspired events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized as not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana

Europeana Data Quality Committee formed to push this strategic point (2015-2020)

Evolving specification (since 2009):

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc

42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
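The short-cut vs long-path construct can be sketched with triples as plain Python tuples. The CRM class and property names are the real ones; the `ex:` resources and the `expand_has_dimension` helper are made up for illustration, and a real project would do this with an RDF library or inference rules.

```python
def expand_has_dimension(triples):
    """Replace each crm:P43_has_dimension short-cut by the long path:
    thing -P39i_was_measured_by-> measurement -P40_observed_dimension-> dimension."""
    expanded = []
    n = 0
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            n += 1
            measurement = f"ex:measurement{n}"  # hypothetical node name
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((s, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "crm:P40_observed_dimension", o))
        else:
            expanded.append((s, p, o))
    return expanded


short = [
    ("ex:painting1", "rdf:type", "crm:E22_Man-Made_Object"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
]
long_path = expand_has_dimension(short)
# The extra Measurement node can now carry details, e.g. who did the measuring.
long_path.append(("ex:measurement1", "crm:P14_carried_out_by", "ex:curator1"))
for triple in long_path:
    print(triple)
```

The short-cut is compact, while the long path gives a place (the Measurement event) to attach who measured, when, and by what method.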

43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml)
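As a rough idea of what such an annotation looks like in JSON-LD, here is a sketch of the "Image and region over it" case. The keys follow my reading of the W3C Web Annotation Data Model; the `image_region_annotation` helper and the example.org URLs are invented for this example.

```python
import json


def image_region_annotation(anno_id, image_url, x, y, w, h, comment):
    """Build a Web Annotation that attaches a comment to a rectangular image region."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                # Media Fragments syntax for a rectangle: xywh=x,y,width,height
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


annotation = image_region_annotation(
    "http://example.org/anno/1",        # hypothetical URLs
    "http://example.org/images/painting.jpg",
    100, 50, 200, 300,
    "Signature of the artist",
)
print(json.dumps(annotation, indent=2))
```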

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
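The image API is essentially a URL pattern: identifier, region, size, rotation, quality and format. A minimal sketch of building such URLs follows, assuming the parameter order and values of the IIIF Image API 2.1 as I understand them. The `iiif_image_url` helper, the server base URL and the identifier are hypothetical.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    # The identifier is percent-encoded so that slashes or spaces don't break the path.
    return (f"{base.rstrip('/')}/{quote(identifier, safe='')}"
            f"/{region}/{size}/{rotation}/{quality}.{fmt}")


base = "https://example.org/iiif"  # hypothetical image server
print(iiif_image_url(base, "folio 1r"))
# A 200x300 pixel region at offset (100, 50), scaled to 400 pixels wide:
print(iiif_image_url(base, "folio 1r", region="100,50,200,300", size="400,"))
```

A DeepZoom viewer simply requests many such region/size combinations (tiles) as the user pans and zooms.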

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD

45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry: info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats

453 A TASTE OF FRBROO

EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

"FRBR, Before and After" by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
"Mistakes have been made", K. Coyle, SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk

4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  dc:subject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good

E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Progress report (2015). Mlist for comments
Conceptual Model 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)

51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info like this

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo

52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences: 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities, VIAF (not in M-n-M but on WD); BM persons, BBC YourPaintings, Artsy, etc

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman)

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions
Includes the (in)famous diagram

64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL

66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V., LNCS 9398, p.182-196, January 2016 (preprint)

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016

665 EFD ENRICHMENT FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
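Jittering itself is simple: add a small random offset to each marker. A minimal sketch follows, not the actual EFD code: the `jitter` helper, the offset of half a degree and the approximate center coordinates are all assumptions.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Move a point by a random offset of up to max_offset_deg in each direction."""
    return (lat + rng.uniform(-max_offset_deg, max_offset_deg),
            lon + rng.uniform(-max_offset_deg, max_offset_deg))


rng = random.Random(42)            # fixed seed, so the map looks the same on every run
center = (42.7, 25.5)              # roughly the center of Bulgaria (approximate)
points = [jitter(*center, rng=rng) for _ in range(9000)]
print(points[:3])
```

A uniform offset spreads markers over a square; a real application might prefer to keep them inside the country outline.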

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (święcenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter and a Polish tradition

67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A., International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies:

Prefix  Ontology                               Used for
bibo    Bibliography Ontology                  Sources
dc      Dublin Core Elements                   common
dct     Dublin Core Terms                      common
foaf    Friend of a Friend ontology            Contributors
iso     ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                  Basic RDF representation
prov    Provenance Ontology                    Revision history
rdf     Resource Description Framework         Basic RDF representation
rdfs    RDF Schema                             Basic RDF representation
schema  Schema.org                             common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System   Basic vocabulary representation
skosxl  SKOS Extension for Labels              Rich labels
wgs     W3C World Geodetic Survey geo          Geo (TGN)
xsd     XML Schema Datatypes                   Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
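What the inverse/symmetric declarations buy you can be sketched in a few lines: from each asserted relation, the reverse statement follows automatically. The relation names below (`ex:teacher_of`, `ex:student_of`, `ex:related_to`) and the `materialize_inverses` helper are hypothetical, not actual GVP relation codes; a semantic repository would derive these statements by inference instead.

```python
def materialize_inverses(triples, inverse_of, symmetric):
    """Add (o, inverse(p), s) for every (s, p, o) whose property has an inverse."""
    inverse = dict(inverse_of)
    inverse.update({v: k for k, v in inverse_of.items()})  # inverseOf works both ways
    inverse.update({p: p for p in symmetric})              # symmetric props are self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


asserted = {
    ("ex:master", "ex:teacher_of", "ex:pupil"),
    ("ex:artist_a", "ex:related_to", "ex:artist_b"),
}
closed = materialize_inverses(
    asserted,
    inverse_of={"ex:teacher_of": "ex:student_of"},
    symmetric={"ex:related_to"},
)
for triple in sorted(closed):
    print(triple)
```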

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P., Getty Research Institute, 3.2 edition, March 2015

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in "Getty Vocabs: Why LOD? Why Now?", J. Cobb, 2014. E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after"

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database


13 GOOGLE NGRAMS TWO SPECIFIC ORGS
Compare two specific orgs: Facebook is more popular in recent books than British Museum has been over time.

14 GOOGLE TRENDS SEARCH POPULARITY
Web searches over the last 12 years: Facebook and Google are much more popular than library and museum.

15 HOW TO SURVIVE IN THE INTERNET AGE
Since ancient times, GLAMs have been the centers of knowledge and wisdom.

Aren't Google, Wikipedia, Facebook, Twitter and smartphone apps becoming the new centers of research and culture (or at least popular culture)?
Will GLAMs fall victim to teenagers with smartphones browsing Facebook? If the library's attitude is "Come search in our OPAC", then certainly yes.
How to preserve the role of GLAMs into the new millennium?

To survive, GLAMs must adopt the internet as their default modus operandi:

Web 1.0: presentation
Web 2.0: interaction
Web 3.0 (semantic web): data linking, enriching, disambiguating text using NLP/IE approaches

16 WHY LINKED OPEN DATA (LOD) IS IMPORTANT
Culture is naturally cross-institutional, cross-border, multilingual and interlinked.
LOD allows making connections between (and making sense of) the multitude of digitized cultural artifacts available on the net.
LOD enables large-scale Digital Humanities research, collaboration and aggregation, and technological renewal of CH institutions.

2 GLAM CONTENT STANDARDS
GLAM data is complex and varied:

Exception is the rule
Many metadata format variations
Data comes from a variety of systems

Thus professional organizations have found it useful to define content standards:

Describe what data to capture (and sometimes how to go about it)
Before formalizing how to express it in machine-readable form

Examples are extremely useful for data modelers to decide how to map the data.

21 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture and museums.

211 CCO EXAMPLE ARTWORK AND CREATOR RECORD

212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS

213 CCO EXAMPLE CREATOR EXTENT

How to describe one aspect of the data

214 SPECTRUM

UK Museum Collections Management Standard

Defines procedures for museums to follow, and the attendant data.
Covers 21 procedures: Pre-entry; Object entry; Loans in; Acquisition; Inventory control; Location and movement control; Transport; Cataloguing; Object condition checking and technical assessment; Conservation and collections care; Risk management; Insurance and indemnity management; Valuation control; Audit; Rights management; Use of collections; Object exit; Loans out; Loss and damage; Deaccession and disposal; Retrospective documentation.
Addresses accreditation.

215 SPECTRUM EXAMPLE OBJECT ENTRY

22 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions

Image by D. Pitti, 2015

23 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But they sometimes pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

231 FRBR FRSAD FRAD

Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD). (J. Mitchell, M. Zeng, M. Žumer, 2011)

232 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):

Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
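The four WEMI levels nest naturally, so a small data-structure sketch may help make the hierarchy concrete. This is only an illustration in Python: the class and field names, the placeholder ISBN strings and the barcodes are assumptions made for the example, not part of FRBR or of any library system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single physical copy (what a library lends out)."""
    barcode: str  # hypothetical local identifier


@dataclass
class Manifestation:
    """A publisher's product; ISBNs live at this level."""
    isbn: str  # placeholder strings below, not real ISBNs
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A translation or edition of a Work."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The original or derived intellectual work."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Count all physical copies reachable from a Work."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("isbn-es-1", [Item("copy-1")])]),
    Expression("en", [Manifestation("isbn-en-1", [Item("copy-2"), Item("copy-3")])]),
])
print(count_items(quixote))
```

Walking from Work down to Item is what lets a catalogue answer "which copies of any edition of this work are available".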

233 FRSAD

Anything can be a subject (thema), referred to by various names/titles (nomen).

234 FRBR-LRM

FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards.

3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (By Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library.)

31 SEEING STANDARDS (2)

32 XML SCHEMAS
Do you deal with XML? I bet you do.

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO, 532 categories (data elements).

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.

332 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)

335 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements, inherited from CDWA.

34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context for Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.

EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
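To see why such markup is hard to semanticize, it can help to pull out what is actually machine-readable. The sketch below (Python standard library only) parses a bioghist fragment like the one above; the function name chron_events is invented for this example, and the fragment is re-typed here with assumed quoting and commas. Note that the only structured values are the normalized date and the normalized name strings: the persons are still strings, not links.

```python
import xml.etree.ElementTree as ET

# Re-typed fragment modeled on the EADiva example (quoting/commas assumed).
BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def chron_events(xml_text):
    """Return one dict per chronitem: normalized date, event text, tagged names."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            # Collapse whitespace so the mixed content reads as one sentence.
            "text": " ".join("".join(event.itertext()).split()),
            # Names are available only when the cataloguer tagged them.
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


for ev in chron_events(BIOGHIST):
    print(ev)
```

The second event shows the typical case: no tagged agents at all, so nothing to link.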

35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant (2002).

Resources: marc-must-die.info; FutureLib; a Facebook group; "MARC is dead (is it really?)", an in-depth discussion wiki; a presentation by Sally Chambers (ELAG 2011).

4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and conversion back to some other format "lowering"?

RDF is a simple, abstracted data model
It doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). It has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return its description and info
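The point about nesting biases can be illustrated with a toy "lifting" function. This is a minimal sketch in plain Python, not a real RDF library: the lift function, the ex: identifiers and the convention that a nested record carries its identifier under an "id" key are all assumptions made for the example. It shows that a nested record and a by-reference record lift to the same set of triples.

```python
from itertools import count


def lift(record, subject, counter=None):
    """Flatten a (possibly nested) dict into (subject, predicate, object) triples."""
    counter = counter if counter is not None else count(1)
    triples = []
    for key, value in record.items():
        for v in (value if isinstance(value, list) else [value]):
            if isinstance(v, dict):
                # A nested record becomes its own node; without an id it gets
                # a generated blank-node label.
                node = v.get("id") or f"_:b{next(counter)}"
                triples.append((subject, key, node))
                rest = {k: x for k, x in v.items() if k != "id"}
                triples.extend(lift(rest, node, counter))
            else:
                triples.append((subject, key, v))
    return triples


# Same information, two XML-style modeling choices: nested vs referenced by ID.
nested = {"title": "Don Quixote",
          "creator": {"id": "ex:cervantes", "name": "Miguel de Cervantes"}}
by_ref = {"title": "Don Quixote", "creator": "ex:cervantes"}
person = {"name": "Miguel de Cervantes"}

a = set(lift(nested, "ex:book"))
b = set(lift(by_ref, "ex:book")) | set(lift(person, "ex:cervantes"))
print(a == b)
```

Once lifted, the choice between nesting and referencing has disappeared, which is much of what makes merging data from different sources easier.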

41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC CRM: inspired events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that it's not expressive enough. E.g. it can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.

Europeana Data Quality Committee: formed to push this strategic point (2015-2020).

Evolving specification (since 2009):

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc

42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses); prop hierarchy (see - - - at bottom).

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
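The short-cut vs long-path construct can be sketched as a tiny inference step: wherever the long path through a measurement exists, the short-cut triple can be derived. The Python below is an illustration over plain tuples, not how any particular CRM tool implements it; the function name infer_shortcuts and the ex: nodes are hypothetical.

```python
def infer_shortcuts(triples):
    """Derive crm:P43_has_dimension from the long path
    thing -P39i_was_measured_by-> measurement -P40_observed_dimension-> dimension."""
    observed = {}
    for s, p, o in triples:
        if p == "crm:P40_observed_dimension":
            observed.setdefault(s, []).append(o)
    return {
        (thing, "crm:P43_has_dimension", dim)
        for thing, p, measurement in triples
        if p == "crm:P39i_was_measured_by"
        for dim in observed.get(measurement, [])
    }


long_path = [
    ("ex:painting", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height"),
    # The long path can carry extra detail that the short-cut cannot,
    # e.g. who carried out the measurement.
    ("ex:measurement1", "crm:P14_carried_out_by", "ex:conservator"),
]
print(infer_shortcuts(long_path))
```

The derived triple is convenient for search, while the measurement node remains the place to hang date, actor and method.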

43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
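As a rough illustration of the "Image and region over it" case, the sketch below builds a Web Annotation as a Python dictionary and serializes it to JSON-LD. The key names follow the W3C Web Annotation Data Model as I understand it; the example.org URLs, the note text and the helper name make_annotation are invented for the example.

```python
import json


def make_annotation(anno_id, body, target):
    """Assemble a minimal Web Annotation (JSON-LD) relating a body to a target."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": body,
        "target": target,
    }


region_note = make_annotation(
    "http://example.org/anno1",
    {"type": "TextualBody", "value": "Signature of the artist", "format": "text/plain"},
    {
        "source": "http://example.org/image1.jpg",
        # A media-fragment selector picks out a rectangular region of the image.
        "selector": {
            "type": "FragmentSelector",
            "conformsTo": "http://www.w3.org/TR/media-frags/",
            "value": "xywh=50,60,200,100",
        },
    },
)
print(json.dumps(region_note, indent=2))
```

The same shape covers the other cases in the list: only the body and target change (a bookmark and a webpage, a translation and a document, a commentary and a paragraph).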

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
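What makes the image API "standard" is a fixed URL pattern: identifier, region, size, rotation, then quality and format. The helper below is a sketch assuming the IIIF Image API 2.x conventions (where "full" is a valid size keyword); the server base URL and the identifier are made up, and iiif_image_url is not part of any IIIF library.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    # The identifier is percent-encoded so that any "/" inside it
    # is not mistaken for a path separator.
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# A 512x512 pixel region from the top-left corner, scaled to 256 pixels wide.
url = iiif_image_url("https://example.org/iiif", "ms1/folio 2",
                     region="0,0,512,512", size="256,")
print(url)
```

Because every compliant server understands the same pattern, a viewer can request tiles from any institution without custom integration.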

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD.

45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA). RDAregistry.info is well organized.

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

453 A TASTE OF FRBROO

The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K. Coyle, SWIB 2015.

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search, related books on Kiosk.

4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015)
Conceptual Model 1.0 (Sep 2016), Mlist for comments: documents the key components of archival description, the properties of each, and the relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)

51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used for Wikidata Project Sum of All Paintings.

Works by painter across collections (catalogue raisonné). E.g. Frans Hals.

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

515 LETS FIX THE SECOND ONE

Find it on Getty's site, then add the info like this.

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy.

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters; prosopography (micro biographies). Prosopography is Greek for Facebook (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments.

61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013). Semantic Search.

621 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals.

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole), and everything in between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
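Jittering simply means adding a small random offset to each marker so that objects sharing one place don't stack on a single point. The Python sketch below shows the idea only: the offset size, the uniform distribution and the approximate center coordinates are assumptions for illustration, not the settings used in the EFD application.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the point moved by a random offset of at most max_offset_deg per axis."""
    return (lat + rng.uniform(-max_offset_deg, max_offset_deg),
            lon + rng.uniform(-max_offset_deg, max_offset_deg))


rng = random.Random(42)              # seeded so the layout is reproducible
bulgaria_center = (42.75, 25.5)      # rough coordinates, assumed for the example
markers = [jitter(*bulgaria_center, rng=rng) for _ in range(5)]
for lat, lon in markers:
    print(round(lat, 3), round(lon, 3))
```

A seeded generator keeps markers from jumping around between page loads; a real map would probably also scale the offset to the size of the place.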

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud (Diagram by J. Cobb, 2014).
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.

External Ontologies:

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>).

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
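Declaring relations as inverse pairs means that only one direction needs to be stored, and the other can be generated. The following Python sketch shows that materialization over plain tuples; the ex: property names and the add_inverses helper are invented for illustration and are not the actual GVP relation identifiers or the way the Getty repository computes inference.

```python
def add_inverses(triples, inverse_of, symmetric=()):
    """Return the triples plus the statements implied by inverse/symmetric declarations."""
    inv = dict(inverse_of)
    inv.update({v: k for k, v in inverse_of.items()})   # owl:inverseOf works both ways
    inv.update({p: p for p in symmetric})               # a symmetric prop is its own inverse
    out = set(triples)
    for s, p, o in triples:
        if p in inv:
            out.add((o, inv[p], s))
    return out


stated = [
    ("ex:verrocchio", "ex:teacherOf", "ex:leonardo"),
    ("ex:leonardo", "ex:friendOf", "ex:luca_pacioli"),
]
closed = add_inverses(stated,
                      inverse_of={"ex:teacherOf": "ex:studentOf"},
                      symmetric=["ex:friendOf"])
for t in sorted(closed):
    print(t)
```

With the pairs kept in a spreadsheet, the declarations themselves can be generated the same way, which is presumably the appeal of the Excel-driven approach.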

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced the chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match them to other Person records in the database.
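The first two steps of that idea (create Person records, make likely inferences) can be sketched in a few lines of Python. Everything here is illustrative: the field names follow the sample record, but the extract_persons function, the output keys and the specific inference rules are assumptions, not the EHRI implementation. Matching against the database is left out.

```python
record = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

RELATION_FIELDS = (("nameSpouse", "spouse"), ("nameChild", "child"), ("nameSibling", "sibling"))
OPPOSITE = {"Male": "Female", "Female": "Male"}


def extract_persons(rec):
    """Create one Person dict for the record subject and one per mentioned relative."""
    source = rec["id"]
    main = {"name": f'{rec["firstName"]} {rec["lastName"]}',
            "gender": rec.get("gender"), "role": "self", "source": source}
    persons = [main]
    for field_name, role in RELATION_FIELDS:
        if rec.get(field_name):
            persons.append({"name": rec[field_name], "role": role, "source": source})
    for p in persons:
        if p["role"] == "spouse":
            # Likely inferences only: flagged as such so a matcher can weigh them.
            if rec.get("nameSpouseMaiden"):
                p["maidenName"] = rec["nameSpouseMaiden"]
            if main["gender"] in OPPOSITE:
                p["likelyGender"] = OPPOSITE[main["gender"]]
            if rec.get("dateMarriage"):
                p["dateMarriage"] = rec["dateMarriage"]
    return persons


for person in extract_persons(record):
    print(person)
```

Keeping the source record id on every derived Person preserves provenance, which matters once candidate matches have to be reviewed.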

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist   punishment    Cos dist
guarding   0.593507   punishments   0.668144
sentry     0.512083   punish        0.601212
hlinka     0.496201   punishing     0.543213
gate       0.490032   beatings      0.527033
watching   0.484647   penalty       0.497262
rifle      0.484379   deserved      0.490157
lookout    0.482025   beaten        0.473870
patrol     0.477233   straf         0.473338
soldier    0.475982   offense       0.461230
guarded    0.474689   executing     0.459965
police     0.474291   merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
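Both the nearest-neighbour lists and the "semantic differencing" rest on the same operation: comparing word vectors by the cosine of the angle between them. The Python sketch below uses hand-made three-dimensional toy vectors purely to show the arithmetic; they are not output of the INRIA word2vec model, and the function names are chosen for this example.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(vectors, a, b, c):
    """Word whose vector is closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy axes (assumed): [Soviet context, German context, organisation-ness].
toy = {
    "KGB":    [1.0, 0.0, 1.0],
    "Stalin": [1.0, 0.0, 0.0],
    "Hitler": [0.0, 1.0, 0.0],
    "SS":     [0.0, 1.0, 1.0],
    "guard":  [0.2, 0.2, 0.5],
}
print(analogy(toy, "KGB", "Stalin", "Hitler"))
```

In a trained model the vectors have hundreds of dimensions learned from the interview text, but the subtraction and addition work the same way.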

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS WIKIARTHISTORY
Vienna University of Technology (site, paper).

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.

612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

Page 7: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

14 GOOGLE TRENDS SEARCH POPULARITYWeb searches over the last 12 years Facebook Google are much more popular thanlibrary museum

15 HOW TO SURVIVE IN THE INTERNET AGESince ancient times GLAMs have been the centers of knowledge and wisdom

Arenrsquot Google Wikipedia Facebook Twitter and smart-phone apps becoming thenew centers of research and culture (or at least popular culture)Will GLAMs fall victims to teenagers with smartphones browsing Facebook If thelibrarys attitude is Come search in our OPAC then certainly yesHow to preserve the role of GLAMs into the new millennium

To survive GLAMs must adopt the internet as their default modus operandi

Web 10 presentationWeb 20 interactionWeb 30 (semantic web) data linking enrichingdisambiguating text using NLPIEapproaches

16 WHY LINKED OPEN DATA (LOD) IS IMPORTANT

Culture is naturally cross-institutional, cross-border, multilingual and interlinked.
LOD allows making connections between (and making sense of) the multitude of digitized cultural artifacts available on the net.
LOD enables large-scale Digital Humanities research, collaboration and aggregation, and technological renewal of CH institutions.

2 GLAM CONTENT STANDARDS

GLAM data is complex and varied:

Exception is the rule
Many metadata format variations
Data comes from a variety of systems

Thus professional organizations have found it useful to define content standards:

Describe what data to capture (and sometimes how to go about it)
Before formalizing how to express it in machine-readable form

Examples are extremely useful for data modelers to decide how to map the data.

21 MUSEUM CONTENT STANDARDS

Cataloging Cultural Objects (CCO): content standard for art, architecture, museums.

211 CCO EXAMPLE ARTWORK AND CREATOR RECORD

212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS

213 CCO EXAMPLE CREATOR EXTENT

How to describe one aspect of the data

214 SPECTRUM

UK Museum Collections Management Standard

Defines procedures for museums to follow and the attendant data.
Covers 21 procedures: Pre-entry; Object entry; Loans in; Acquisition; Inventory control; Location and movement control; Transport; Cataloguing; Object condition checking and technical assessment; Conservation and collections care; Risk management; Insurance and indemnity management; Valuation control; Audit; Rights management; Use of collections; Object exit; Loans out; Loss and damage; Deaccession and disposal; Retrospective documentation.
Addresses accreditation.

215 SPECTRUM EXAMPLE OBJECT ENTRY

22 ARCHIVAL CONTENT STANDARDS

ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (eg Secretary of some society)
ISDIAH: archival holding institutions

Image by D.Pitti, 2015

23 LIBRARY CONTENT STANDARDS

AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But they sometimes pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

231 FRBR FRSAD FRAD

Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J.Mitchell, M.Zeng, M.Žumer, 2011)

232 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):

Work: original or derived intellectual work (eg Don Quixote)
Expression: translation or edition (eg Don Quixote translation to English)
Manifestation: publisher's work (eg with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (eg Lincoln's Bible); manuscripts are singleton items
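The four WEMI levels nest naturally, so a small data-structure sketch may help. This is only an illustration, not a FRBR implementation: the class fields, the publisher name, the ISBN placeholders and the shelf marks are all made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single physical copy, eg the one a library loans out."""
    shelf_mark: str
    on_loan: bool = False


@dataclass
class Manifestation:
    """A publisher's product; ISBNs live at this level."""
    publisher: str
    isbn: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A translation or edition of a Work."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The original or derived intellectual work."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


def available_items(work: Work, language: str) -> List[Item]:
    """Walk Work -> Expression -> Manifestation -> Item, keeping copies not on loan."""
    return [
        item
        for expr in work.expressions if expr.language == language
        for man in expr.manifestations
        for item in man.items if not item.on_loan
    ]


# Hypothetical holdings: publisher, ISBN placeholders and shelf marks are invented
quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("Hypothetical Press", "isbn-placeholder-1",
                                    [Item("ES-001")])]),
    Expression("en", [Manifestation("Hypothetical Press", "isbn-placeholder-2",
                                    [Item("EN-001", on_loan=True), Item("EN-002")])]),
])
print([i.shelf_mark for i in available_items(quixote, "en")])
```

The user task "obtain" then becomes a walk down the hierarchy: pick the Work, narrow to an Expression (language), and end at an Item that is actually on the shelf.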

233 FRSAD

Anything can be a subject (thema), referred to by various names/titles (nomen).

234 FRBR-LRM

FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards.

3 GLAM METADATA SCHEMAS

How many of the standards listed in "Seeing Standards: A Visualization of the Metadata Universe" apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

31 SEEING STANDARDS (2)

32 XML SCHEMAS

Do you deal with XML? I bet you do.

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. Eg EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (eg cross-field validation)

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

33 MUSEUM METADATA CDWA

Categories for the Description of Works of Art (CDWA): realization of CCO, 532 categories (data elements).

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, eg for Dimension.

332 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R.Stein and A.Vitzthum, ATHENA workshop, 2010)

335 LIDO SCHEMA

Complex schema: eg when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA.

34 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize).
Emphasis on documents, not historic agents and events.

EAG: So-called "controlled access points" are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline that's also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
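To make the "strings vs things" point concrete, here is a minimal sketch that pulls the structured parts out of the bioghist fragment above: the normalized date and whatever person names happen to be tagged. It uses only the Python standard library. The fragment is retyped with its markup restored, and the comma inside the normal attributes is an assumption.

```python
import xml.etree.ElementTree as ET

# The EADiva-style fragment, retyped with markup restored (attribute commas assumed)
BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def extract_events(xml_text):
    """Return one dict per <chronitem>: normalized date, tagged persons, plain text."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            "persons": [p.get("normal") for p in event.iter("persname")],
            # Collapse whitespace so line breaks in the source do not leak into the text
            "text": " ".join("".join(event.itertext()).split()),
        })
    return events


events = extract_events(BIOGHIST)
for e in events:
    print(e["date"], e["persons"], e["text"])
```

The second event yields an empty persons list: whoever was involved in the reorganization is simply not recoverable without tagging, and even the tagged names are strings rather than links to authority records.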

35 LIBRARY METADATA MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant (2002). Resources: the marc-must-die.info wiki, FutureLib, a Facebook group, the in-depth discussion "MARC is dead (is it really?)", and a presentation by Sally Chambers (ELAG 2011).

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting", and back to some other format "lowering"?

RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info

41 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM inspired: events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that it's not expressive enough. Eg it can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.

Europeana Data Quality Committee formed to push this strategic point (2015-2020).

Evolving specification (since 2009):

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc

42 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee).
Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses). Prop hierarchy (see - - - at bottom).

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial and Graphical Representation (HTML versions, or continuous HTML including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
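The short-cut vs long-path construct can be sketched over plain triples. The snippet below is a toy illustration, not a reasoner: it derives the short-cut crm:P43_has_dimension wherever the long path through a Measurement exists. All ex: identifiers are hypothetical instance data.

```python
P39I = "crm:P39i_was_measured_by"
P40 = "crm:P40_observed_dimension"
P43 = "crm:P43_has_dimension"


def infer_shortcuts(triples):
    """Add <thing P43 dimension> for every <thing P39i measurement P40 dimension> path."""
    measured_by = [(s, o) for s, p, o in triples if p == P39I]
    observed = {}
    for s, p, o in triples:
        if p == P40:
            observed.setdefault(s, []).append(o)
    inferred = {
        (thing, P43, dim)
        for thing, measurement in measured_by
        for dim in observed.get(measurement, [])
    }
    return set(triples) | inferred


# Hypothetical instance data: the long path can say who measured, the short-cut cannot
long_path = {
    ("ex:painting1", P39I, "ex:measurement1"),
    ("ex:measurement1", "crm:P14_carried_out_by", "ex:curator1"),
    ("ex:measurement1", P40, "ex:height1"),
}
graph = infer_shortcuts(long_path)
print(("ex:painting1", P43, "ex:height1") in graph)
```

The intermediate Measurement node is what "allows more details": it is an event, so it can carry an actor, a date or a technique, none of which fit on the single short-cut triple.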

43 WEB ANNOTATION (OPEN ANNOTATION OA)

W3C TR: mark, annotate, relate any web resources. Eg Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
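As an illustration of the "Image and region over it" case, the sketch below builds a Web Annotation as JSON-LD using terms from the W3C annotation context (TextualBody, FragmentSelector with a media-fragment xywh value). The example.org URLs, the region and the comment text are placeholders.

```python
import json


def image_region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation (JSON-LD) that comments on a rectangular image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Placeholder identifiers, for illustration only
anno = image_region_annotation(
    "http://example.org/anno/1", "http://example.org/image1.jpg",
    (50, 120, 200, 80), "Signature of the artist")
print(json.dumps(anno, indent=2))
```

The same body/target shape covers the other pairs listed above; only the selector changes (eg a text selector for a paragraph instead of a rectangle).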

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
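The IIIF Image API addresses an image region through a fixed URL pattern: base, identifier, region, size, rotation, then quality.format. A minimal sketch of a URL builder follows. The server base and the identifier are hypothetical; the parameter values used ("full", "max", "0", "default", "x,y,w,h", "w,") are ones the Image API defines.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    return "/".join([
        base.rstrip("/"),
        quote(identifier, safe=""),  # identifiers must be percent-encoded, slashes included
        region, size, rotation, f"{quality}.{fmt}",
    ])


# Hypothetical server and identifier: a 300x400 crop, scaled to 600 pixels wide
url = iiif_image_url("https://example.org/iiif", "ms-1/folio 3",
                     region="100,200,300,400", size="600,")
print(url)
```

Because the crop and scale are in the URL itself, a deep-zoom viewer only has to request tiles by changing the region and size segments.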

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See eg Rob Sanderson's presentations on IIIF and JSON-LD.

45 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (eg citation intent)
BIBFRAME: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA): RDAregistry.info is well organized.

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

453 A TASTE OF FRBROO

The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, eg on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

"FRBR, Before and After" by K.Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
"Mistakes have been made", K.Coyle, SWIB 2015.

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on Kiosk.

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig" ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

46 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

Eg "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015). Mlist for comments
Conceptual Model 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected:
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M.Hildebrand)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used for Wikidata Project Sum of All Paintings.

Works by painter across collections (catalogue raisonné). Eg Frans Hals.

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2).

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

516 HISTROPEDIA

Timelines of everything. Eg paintings by Leonardo.

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: "Authority Addicts: The New Frontier of Authority Control on Wikidata", Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: "Name Data Sources for Semantic Enrichment", study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy.

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, eg:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments.

61 MELLON SPACE PROJECTS

The Andrew W. Mellon Foundation funds many projects in CH and DH, and a few software projects including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM: search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013). Semantic Search.

621 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals.

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D.Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

"Domain-specific modeling: Towards a Food and Drink Gazetteer", Tagarev A., Tolosi L. and Alexiev V., LNCS 9398, p.182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

"Using DBPedia in Europeana Food and Drink", Alexiev V., DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
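Jittering can be as simple as adding a small random offset to each marker. The sketch below is an assumption about how one might do it, not the EFD implementation: the centre coordinates are approximate and the 0.5 degree spread is an arbitrary choice.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return (lat, lon) moved by a random offset of at most max_offset_deg on each axis."""
    return (lat + rng.uniform(-max_offset_deg, max_offset_deg),
            lon + rng.uniform(-max_offset_deg, max_offset_deg))


# Approximate centre of Bulgaria; coordinates and spread are assumptions
BULGARIA = (42.7, 25.5)
rng = random.Random(42)  # seeded so the layout is reproducible between page loads
flags = [jitter(*BULGARIA, rng=rng) for _ in range(9000)]
print(flags[:3])
```

Seeding the generator keeps each flag in the same place on every render. A uniform square offset is the crudest option; scaling the spread to the size of the place would keep flags inside small countries.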

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (Diagram by J.Cobb, 2014).
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust. AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03.

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk (support on twitter and google group, see home page)
Presentations, papers

"On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI)", Alexiev V., Lindenthal J. and Isaac A., International Journal on Digital Libraries, August 2015, Springer.

673 COMPLETE REPRESENTATION OF ALL GVP INFO

External Ontologies. See "GVP LOD: Ontologies and Semantic Representation", V.Alexiev, CIDOC 2014.

Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (eg <term/kind/Abbreviation>).

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

"Getty Vocabularies: Linked Open Data: Semantic Representation", Alexiev V., Cobb J., Garcia G., Harpring P., Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search.
Many described in "Getty Vocabs: Why LOD? Why Now?", J.Cobb, 2014. Eg:

AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

68 JPGETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) "Cast after".

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

"Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure", V.Alexiev, L.Brazzo, CIDOC Congress, 2016.

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced the chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
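A rough sketch of the first two steps (creating Person records for everyone mentioned, then adding likely inferences) might look as follows. The field names come from the sample record above; the output structure, the role labels and the gender rule are assumptions for illustration, and the matching step is left out.

```python
def expand_record(rec):
    """Turn one source record into Person dicts for everyone it mentions."""
    people = [{"name": f'{rec["firstName"]} {rec["lastName"]}',
               "gender": rec.get("gender"), "source": rec["id"], "role": "subject"}]
    if "nameSpouse" in rec:
        spouse = {"name": rec["nameSpouse"], "source": rec["id"], "role": "spouse"}
        # Likely inference, not a certainty: spouse gender is the opposite of the subject's
        if rec.get("gender") == "Male":
            spouse["gender"] = "Female"
        elif rec.get("gender") == "Female":
            spouse["gender"] = "Male"
        if "nameSpouseMaiden" in rec:
            spouse["maidenName"] = rec["nameSpouseMaiden"]
        people.append(spouse)
    for key, role in (("nameChild", "child"), ("nameSibling", "sibling")):
        if key in rec:
            people.append({"name": rec[key], "source": rec["id"], "role": role})
    return people


record = {"id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
          "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
          "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
          "nameSibling": "Jack Jones"}
persons = expand_record(record)
for p in persons:
    print(p)
```

Keeping the source record id on every derived Person preserves the family link, so a later match of "Maria Smith" (maiden name Matienzo) in another record connects two networks rather than two isolated names.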

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews:

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123

semantic differencing (interesting): KGB - Stalin + Hitler = SS
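Both the neighbour lists and the "semantic differencing" rest on the same operation: cosine comparison of word vectors. The sketch below shows the arithmetic with made-up 3-dimensional vectors. Real word2vec vectors have hundreds of dimensions and are learned from the interview text, so the numbers here carry no meaning beyond illustrating the mechanics.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(vectors, minus, plus, start):
    """Return the word closest to start - minus + plus, excluding the three inputs."""
    target = [s - m + p for s, m, p in zip(vectors[start], vectors[minus], vectors[plus])]
    candidates = {w: v for w, v in vectors.items() if w not in (start, minus, plus)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))


# Toy vectors invented for illustration only
toy = {
    "KGB":    [1.0, 0.0, 1.0],
    "Stalin": [1.0, 0.0, 0.0],
    "Hitler": [0.0, 1.0, 0.0],
    "SS":     [0.0, 1.0, 1.0],
    "guard":  [0.3, 0.3, 0.4],
}
print(analogy(toy, minus="Stalin", plus="Hitler", start="KGB"))
```

Excluding the input words from the candidates matters in practice, since the nearest vector to the computed target is often one of the inputs themselves.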

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. Eg a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

Page 8: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

15 HOW TO SURVIVE IN THE INTERNET AGESince ancient times GLAMs have been the centers of knowledge and wisdom

Arenrsquot Google Wikipedia Facebook Twitter and smart-phone apps becoming thenew centers of research and culture (or at least popular culture)Will GLAMs fall victims to teenagers with smartphones browsing Facebook If thelibrarys attitude is Come search in our OPAC then certainly yesHow to preserve the role of GLAMs into the new millennium

To survive GLAMs must adopt the internet as their default modus operandi

Web 10 presentationWeb 20 interactionWeb 30 (semantic web) data linking enrichingdisambiguating text using NLPIEapproaches

16 WHY LINKED OPEN DATA (LOD) IS IMPORTANTCulture is naturally cross-institutional cross-border multilingual and interlinkedLOD allows making connections between (and making sense of) the multitude ofdigitized cultural artifacts available on the netLOD enables large-scale Digital Humanities research collaboration and aggregationtechnological renewal of CH institutions

2 GLAM CONTENT STANDARDSGLAM data is complex and varied

Exception is the ruleMany metadata format variationsData comes from a variety of systems

Thus professional organizations have found it useful to de ne content standards

Describe what data to capture (and sometimes how to go about it)Before formalizing how to express it in machine-readable form

Examples are extremely useful for data modelers to decide how to map the data

21 MUSEUM CONTENT STANDARDS content standard for art architecture museumsCataloging Cultural Objects

211 CCO EXAMPLE ARTWORK AND CREATOR RECORD

212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS

213 CCO EXAMPLE CREATOR EXTENT

How to describe one aspect of the data

214 SPECTRUM

UK Museum Collections Management Standard

De nes procedures for museums to follow and the attendant dataCovers 21 procedures Pre-entry Object entry Loans in Acquisition Inventorycontrol Location and movement control Transport Cataloguing Object conditionchecking and technical assessment Conservation and collections care Riskmanagement Insurance and indemnity management Valuation control Audit Rightsmanagement Use of collections Object exit Loans out Loss and damageDeaccession and disposal Retrospective documentationAddresses accreditation

215 SPECTRUM EXAMPLE OBJECT ENTRY

22 ARCHIVAL CONTENT STANDARDS

ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions

Image by D. Pitti, 2015

23 LIBRARY CONTENT STANDARDS

AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But they sometimes pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

231 FRBR FRSAD FRAD

Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Žumer, 2011)

232 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):

Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability, famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
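The four WEMI levels nest naturally, which a minimal Python sketch can make concrete. The class and field names below (and the placeholder ISBNs and shelf marks) are illustrative assumptions, not part of FRBR itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single physical copy (what a library lends out)."""
    shelf_mark: str  # hypothetical local identifier


@dataclass
class Manifestation:
    """A publisher's embodiment; ISBNs attach at this level."""
    isbn: str  # placeholder values below, not real ISBNs
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A translation or edition of a Work."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The original or derived intellectual work."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Count physical copies across all expressions and manifestations."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("ISBN-A", [Item("shelf-1"), Item("shelf-2")])]),
    Expression("en", [Manifestation("ISBN-B", [Item("shelf-3")])]),
])
print(count_items(quixote))
```

A manuscript would be modeled here as a Manifestation with exactly one Item.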

233 FRSAD

Anything can be subject (thema), referred to by various names/titles (nomen)

234 FRBR-LRM

FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards

3 GLAM METADATA SCHEMAS

How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

31 SEEING STANDARDS (2)

32 XML SCHEMAS

Do you deal with XML? I bet you do.

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)

Tools

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

33 MUSEUM METADATA CDWA

Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements
Display vs Indexing (structured) elements, e.g. for Dimension

332 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R. Stein and A. Vitzthum, ATHENA workshop, 2010)

335 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements inherited from CDWA

34 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events

EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings) not links (things)
EAC: Events include lots of info but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
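To see what can and cannot be recovered from such a chronlist, here is a rough Python sketch that pulls dates, event text and tagged person names out of the bioghist example using only the standard library. The triple-like output shape and the `event1`, `date`, `label`, `mentions` names are assumptions for illustration; note that the persons still come out as name strings, not links.

```python
import xml.etree.ElementTree as ET

EAD_SNIPPET = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def lift_chronlist(xml_text):
    """Turn each chronitem into (event id, property, value) tuples."""
    root = ET.fromstring(xml_text)
    triples = []
    for n, item in enumerate(root.iter("chronitem"), start=1):
        event_id = f"event{n}"  # hypothetical local identifier
        date = item.find("date")
        if date is not None:
            triples.append((event_id, "date", date.get("normal")))
        event = item.find("event")
        if event is None:
            continue
        # Flatten mixed content and normalize whitespace
        label = " ".join("".join(event.itertext()).split())
        triples.append((event_id, "label", label))
        for pers in event.iter("persname"):
            # Only a normalized name string is available, no identifier
            triples.append((event_id, "mentions", pers.get("normal")))
    return triples


triples = lift_chronlist(EAD_SNIPPET)
for t in triples:
    print(t)
```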

35 LIBRARY METADATA MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles
MARC-XML is not much better

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant (2002):

marc-must-die.info: in-depth discussion wiki
FutureLib: Facebook group
Presentation "MARC is dead (is it really?)" by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting" and back to some other format "lowering"?

RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info

41 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana

Europeana Data Quality Committee formed to push this strategic point (2015-2020)

Evolving specification (since 2009)

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.

42 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. by CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses); prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version including Kindle), Graphical Representation (or continuous HTML version including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
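The short-cut vs long-path construct can be sketched as a small inference over triples: whenever the long path through a measurement exists, the short-cut property can be derived from it. This is only an illustrative Python sketch with triples as plain tuples; the `ex:` resources are made up, and a real deployment would typically express this as a repository rule rather than application code.

```python
def infer_shortcuts(triples):
    """Derive crm:P43_has_dimension from the long path
    crm:P39i_was_measured_by / crm:P40_observed_dimension."""
    measured_by = [(s, o) for s, p, o in triples
                   if p == "crm:P39i_was_measured_by"]
    observed = {}
    for s, p, o in triples:
        if p == "crm:P40_observed_dimension":
            observed.setdefault(s, []).append(o)
    inferred = set(triples)
    for thing, measurement in measured_by:
        for dimension in observed.get(measurement, []):
            inferred.add((thing, "crm:P43_has_dimension", dimension))
    return inferred


# Hypothetical data: the long path keeps extra detail (when it was measured)
graph = {
    ("ex:painting1", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height1"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:year2015"),
}
result = infer_shortcuts(graph)
print(sorted(result))
```

The short-cut is convenient for search, while the measurement node in the long path is where details such as who measured and when can be attached.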

43 WEB ANNOTATION (OPEN ANNOTATION OA)

W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml)

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers: http://iiif.io

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson presentations: IIIF and JSON-LD

45 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDARegistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
Schema BibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats

453 A TASTE OF FRBROO

EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

46 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Progress report (2015), Mlist for comments
Conceptual Model 1.0 (Sep 2016): Document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info like this

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
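Leveraging coreferencing across datasets amounts to merging pairwise "same as" links into clusters, so that a Wikidata link to GND can reach VIAF transitively. A minimal Python sketch using union-find follows; the identifiers are invented placeholders and this is not how VIAF or Wikidata actually compute their clusters.

```python
def coreference_clusters(pairs):
    """Merge pairwise same-as links into clusters (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for x in list(parent):
        clusters.setdefault(find(x), set()).add(x)
    # Sorted output so the result is deterministic
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical identifiers: one person known to three authorities,
# another known to two
links = [
    ("wikidata:A", "viaf:1"),
    ("viaf:1", "gnd:X"),
    ("wikidata:B", "rkd:7"),
]
clusters = coreference_clusters(links)
print(clusters)
```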

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). Prosopography is Greek for Facebook (SNAP:DRGN project, 2015)

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search: "Drawings by Rembrandt that are about Mammals"

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

Watch the video (D. Oldman). (Not Ontotext work)

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions
Includes the (in)famous diagram

64 CONSERVATIONSPACE

Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev, A., Tolosi, L. and Alexiev, V. LNCS 9398, p.182-196, January 2016 (preprint)

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev, V. DBpedia meeting, February 2016

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
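Jittering can be sketched in a few lines of Python: each object that only has a country-level location gets a small random offset so the flags spread out instead of stacking on one point. The center coordinates and the half-degree spread below are rough assumptions for illustration, not the values used in the EFD app.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=None):
    """Return the point shifted by a small random offset in each axis."""
    rng = rng or random.Random()
    return (lat + rng.uniform(-max_offset_deg, max_offset_deg),
            lon + rng.uniform(-max_offset_deg, max_offset_deg))


# Approximate center of Bulgaria (assumption, for illustration only)
BULGARIA_CENTER = (42.7, 25.5)

rng = random.Random(42)  # seeded so the layout is reproducible
points = [jitter(*BULGARIA_CENTER, rng=rng) for _ in range(5)]
for lat, lon in points:
    print(f"{lat:.4f}, {lon:.4f}")
```

Seeding the generator (or deriving the offset from the object id) keeps each flag in the same place between page loads.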

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev, V., Lindenthal, J. and Isaac, A. International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014

External Ontologies

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
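What the inverse and symmetric declarations buy you can be sketched in Python: from one asserted relation, the reverse statement follows automatically. The property names below are hypothetical stand-ins, not actual GVP associative relations, and in practice the repository's reasoner does this work.

```python
# Hypothetical declarations, standing in for owl:inverseOf pairs
# and owl:SymmetricProperty flags in the ontology
INVERSE_OF = {"ex:teacher_of": "ex:student_of"}
SYMMETRIC = {"ex:related_to"}


def add_inverses(triples, inverse_of, symmetric):
    """Add the reverse statement for every inverse or symmetric property."""
    both_ways = dict(inverse_of)
    both_ways.update({v: k for k, v in inverse_of.items()})
    out = set(triples)
    for s, p, o in triples:
        if p in both_ways:
            out.add((o, both_ways[p], s))
        if p in symmetric:
            out.add((o, p, s))
    return out


asserted = {
    ("ex:artistA", "ex:teacher_of", "ex:artistB"),
    ("ex:conceptX", "ex:related_to", "ex:conceptY"),
}
closed = add_inverses(asserted, INVERSE_OF, SYMMETRIC)
print(sorted(closed))
```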

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev, V., Cobb, J., Garcia, G., Harpring, P. Getty Research Institute, 3.2 edition, March 2015

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.

AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

68 JPGETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
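The idea can be sketched in Python: expand one flat record into several Person records plus typed links, marking the derived ones as inferred. The output structure, the `birthSurname` field and the id scheme are assumptions for illustration, not EHRI's actual data model, and the matching step against the database is left out.

```python
RECORD = {
    "id": "123456", "firstName": "John", "lastName": "Smith",
    "gender": "Male", "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

# Which "additional name" fields imply which relation (assumed mapping)
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def expand_record(rec):
    """Create Person records and links for everyone mentioned in rec."""
    main_id = "person/" + rec["id"]
    persons = [{
        "id": main_id,
        "name": rec["firstName"] + " " + rec["lastName"],
        "gender": rec.get("gender"),
    }]
    links = []
    for field_name, relation in RELATION_FIELDS.items():
        name = rec.get(field_name)
        if not name:
            continue
        person = {"id": main_id + "/" + relation, "name": name, "inferred": True}
        link = {"from": main_id, "relation": relation, "to": person["id"]}
        if relation == "spouse":
            # Likely inferences: maiden name is the spouse's birth surname,
            # and the marriage date belongs to the spouse link
            if rec.get("nameSpouseMaiden"):
                person["birthSurname"] = rec["nameSpouseMaiden"]
            if rec.get("dateMarriage"):
                link["date"] = rec["dateMarriage"]
        persons.append(person)
        links.append(link)
    return persons, links


persons, links = expand_record(RECORD)
for p in persons:
    print(p)
```

Each inferred Person can then be compared against existing records, with the links serving as extra evidence for a match.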

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
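Both the nearest-neighbor lists and the "semantic differencing" rest on cosine similarity between word vectors. The Python sketch below uses tiny hand-made 3-dimensional vectors purely to show the arithmetic; real word2vec vectors have hundreds of dimensions learned from the interview text, and the numbers here are invented.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Toy vectors (invented); axes loosely mean [organization, soviet, german]
TOY = {
    "kgb":    [1.0, 1.0, 0.0],
    "stalin": [0.0, 1.0, 0.0],
    "hitler": [0.0, 0.0, 1.0],
    "ss":     [1.0, 0.0, 1.0],
    "guard":  [0.9, 0.1, 0.1],
}

print(analogy(TOY, "kgb", "stalin", "hitler"))
```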

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki

Page 9: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

16 WHY LINKED OPEN DATA (LOD) IS IMPORTANTCulture is naturally cross-institutional cross-border multilingual and interlinkedLOD allows making connections between (and making sense of) the multitude ofdigitized cultural artifacts available on the netLOD enables large-scale Digital Humanities research collaboration and aggregationtechnological renewal of CH institutions

2 GLAM CONTENT STANDARDSGLAM data is complex and varied

Exception is the ruleMany metadata format variationsData comes from a variety of systems

Thus professional organizations have found it useful to de ne content standards

Describe what data to capture (and sometimes how to go about it)Before formalizing how to express it in machine-readable form

Examples are extremely useful for data modelers to decide how to map the data

21 MUSEUM CONTENT STANDARDS content standard for art architecture museumsCataloging Cultural Objects

211 CCO EXAMPLE ARTWORK AND CREATOR RECORD

212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS

213 CCO EXAMPLE CREATOR EXTENT

How to describe one aspect of the data

214 SPECTRUM

UK Museum Collections Management Standard

De nes procedures for museums to follow and the attendant dataCovers 21 procedures Pre-entry Object entry Loans in Acquisition Inventorycontrol Location and movement control Transport Cataloguing Object conditionchecking and technical assessment Conservation and collections care Riskmanagement Insurance and indemnity management Valuation control Audit Rightsmanagement Use of collections Object exit Loans out Loss and damageDeaccession and disposal Retrospective documentationAddresses accreditation

215 SPECTRUM EXAMPLE OBJECT ENTRY

22 ARCHIVAL CONTENT STANDARDSISAD(G) archival materialsISAAR(CPF) agents (corporations people families)ISDF functions (eg Secretary of some society)ISDIAH archival holding institutions

Image by DPitti 2015

23 LIBRARY CONTENT STANDARDSAACR2 (Anglo-American Cataloging Rules 2)International Standard Bibliographic Description (ISBD)Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later) But sometimes pay moreattention where to put the commas than to

Data sharingGlobal availability of resourcesSharing the cataloging burden

231 FRBR FRSAD FRAD

Functional Requirements for Bibliographic Records (FRBR) Subject Authority Data(FRSAD) Authority Data (FRAD) (JMitchell MZeng MZumer 2011)

232 FRBR

Starts from user tasks ( nd identify select obtain explore) Introduces the important 4-level WEMI model (relates to Uniform Titles)

Work original or derived intellectual work (eg Don Quixote)Expression translation or edition (eg Don Quixote translation to English)Manifestation publishers work (eg with illustrations foreword by compilationhellip)ISBNs are hereItem physical copy libraries track loanavailability famous copies (eg Lincolns Bible)manuscripts are singleton items

233 FRSAD

Anything can be subject (thema) referred to by various namestitles (nomen)

234 FRBR-LRM

FRBR-Library Reference Model (PRiva PLe Bœuf MŽumer Draft for World-WideReview 2016-02) Merges the previous standards

3 GLAM METADATA SCHEMASHow many of the standards listed in

apply to your work (by Jenn Riley Associate Dean for Digital Initiatives atMcGill University Library)

Seeing Standards A Visualization of the MetadataUniverse

31 SEEING STANDARDS (2)

32 XML SCHEMASDo you deal with XML I bet you do

XML Schema (XSD) most widely used but most unwieldyRelaxNG (RNG) new generation schema languageRNG Compact (RNC) non-XML notation most readable Eg EAD3 is mastered inRNC then RNG and XSD producedSchematron express rules in XPath that cant be captured in XSDRNGRNC (egcross- eld validation)

Tools

patch the jing RNG validator toemit errors like Schematron (SVRL with XPath error location)

RNC tools and CH schemas in RNC Emacswith code highlighting and syntax checking ( ycheck)

httpsgithubcomEHRIjing-trangtreeEHRI-176

httpsgithubcomVladimirAlexievrnc

33 MUSEUM METADATA CDWACategories for the Description of Works of Art (CDWA) realization of CCO 532categories (data elements)

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.

332 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)

335 LIDO SCHEMA

Complex schema, e.g. when referring to a related object you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA.

34 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events

EAG: So-called controlled access points are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
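When person names are tagged, as in the chronology above, the events can be turned from presentation text into structured data. The following is a rough sketch using only the Python standard library; the output dictionary keys (`date`, `text`, `persons`) are my own naming, not part of EAD.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""


def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, event text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # Collapse whitespace so line breaks inside <event> do not matter.
            "text": " ".join("".join(event.itertext()).split()),
            # Only names that were explicitly tagged can be recovered.
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


events = extract_events(BIOGHIST)
```

The second event shows the limitation: without tagged names there is nothing to link, only a string.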

35 LIBRARY METADATA: MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant (2002):

marc-must-die.info: MARC is dead (is it really?)
FutureLib: in-depth discussion wiki
Facebook group
Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting" and back to some other format "lowering"?

RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID)
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
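A toy illustration of "lifting": a nested record is flattened into subject-predicate-object triples, after which it no longer matters whether a sub-element was nested or referenced by ID. This is a sketch under simplifying assumptions; the `ex:` identifiers, the bare property names, and the `lift` helper are invented for the example, and real lifting would map keys to ontology properties.

```python
from itertools import count


def lift(node, subject, triples, ids):
    """Flatten a nested record into (subject, predicate, object) triples."""
    for key, value in node.items():
        if isinstance(value, dict):
            child = f"_:b{next(ids)}"  # blank node standing in for the nested element
            triples.append((subject, key, child))
            lift(value, child, triples, ids)
        else:
            triples.append((subject, key, value))
    return triples


# Hypothetical nested record, as it might come out of an XML or JSON source.
record = {"title": "About time", "publisher": {"name": "Penguin"}}
triples = lift(record, "ex:book1", [], count(1))

for triple in triples:
    print(triple)
```

"Lowering" would be the reverse: choosing a nesting and a syntax for a target format.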

41 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUES/CONSIDERATIONS

Criticized as not expressive enough, e.g. it can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
  EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes; Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
  Currently considering actual implementation of Events
  Extensions for manuscripts, music, fashion, etc.

42 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses); prop hierarchy (see - - - at bottom).

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.

Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
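The short-cut vs long-path construct can be sketched with plain tuples. The long path goes through a measurement event, which can carry extra detail (here a time-span), and the short-cut can be derived from it. The `ex:` instance names and the `derive_shortcuts` helper are hypothetical; the property names are the CRM ones mentioned above plus crm:P4_has_time-span.

```python
# Long path: the painting was measured by an event that observed a dimension.
LONG_PATH = [
    ("ex:painting1", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height1"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:year2010"),  # detail the short-cut loses
]


def derive_shortcuts(triples):
    """Infer thing -P43_has_dimension-> dimension from the P39i/P40 long path."""
    measured = [(s, o) for s, p, o in triples if p == "crm:P39i_was_measured_by"]
    observed = {}
    for s, p, o in triples:
        if p == "crm:P40_observed_dimension":
            observed.setdefault(s, []).append(o)
    return [(thing, "crm:P43_has_dimension", dim)
            for thing, measurement in measured
            for dim in observed.get(measurement, [])]


print(derive_shortcuts(LONG_PATH))
```

Going the other way is not possible: from the short-cut alone you cannot recover who measured the object or when.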

43 WEB ANNOTATION (OPEN ANNOTATION, OA)

W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
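As a rough sketch of the "image and region over it" case, the following builds an annotation as JSON-LD following the W3C Web Annotation data model (a textual body, and a target with a media-fragment selector). The image URL, the comment text, and the helper name `image_region_annotation` are assumptions for illustration.

```python
import json


def image_region_annotation(image_url, x, y, w, h, comment):
    """Build a Web Annotation that attaches a comment to a rectangular image region."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",  # pixel region: x, y, width, height
            },
        },
    }


anno = image_region_annotation(
    "http://example.org/image1.jpg", 10, 20, 300, 200, "Signature of the artist")
print(json.dumps(anno, indent=2))
```

The same body/target pattern covers the other cases (bookmark, translation, commentary) by swapping the body and selector types.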

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
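The core of the IIIF Image API is a URL pattern, {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}, that lets a viewer request just the tile it needs. A small sketch of a URL builder follows; the server base URL and identifier are made up, and the defaults assume Image API 2.x conventions (`full` size, `default` quality).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose an IIIF Image API request URL from its path segments."""
    # The identifier is percent-encoded so that slashes inside it
    # are not mistaken for path separators.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and image: a 1024x1024 region scaled to 512 pixels wide.
url = iiif_image_url("http://example.org/iiif", "book1/page1",
                     region="0,0,1024,1024", size="512,")
print(url)
```

A deep-zoom viewer issues many such requests with different `region` and `size` values as the user pans and zooms.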

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations on IIIF and JSON-LD.

45 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
BibFrame: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

453 A TASTE OF FRBROO

The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

46 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015)
Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them. Mlist for comments
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné), e.g. Frans Hals

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

516 HISTROPEDIA

Timelines of everything, e.g. paintings by Leonardo.

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, and sometimes integrated into Virtual Research Environments

61 MELLON "SPACE" PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals.

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

Watch the video (D. Oldman). (Not Ontotext work)

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
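One simple way to jitter markers is to move each one to a random point inside a small disc around the shared coordinate. The sketch below is an assumption about how this could be done, not the project's actual code: the 50 km radius, the 111 km-per-degree approximation, and the approximate center of Bulgaria are all illustrative choices.

```python
import math
import random

KM_PER_DEGREE = 111.0  # rough length of one degree of latitude


def jitter(lat, lon, radius_km=50.0, rng=random):
    """Return a point displaced randomly within radius_km of (lat, lon)."""
    # sqrt() spreads points uniformly over the disc instead of bunching them at the center.
    distance = radius_km * math.sqrt(rng.random())
    angle = rng.uniform(0, 2 * math.pi)
    dlat = distance * math.cos(angle) / KM_PER_DEGREE
    # A degree of longitude shrinks with latitude, so scale by cos(lat).
    dlon = distance * math.sin(angle) / (KM_PER_DEGREE * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


# Approximate center of Bulgaria (assumed), with a seeded generator for repeatability.
rng = random.Random(42)
points = [jitter(42.7, 25.5, 50.0, rng) for _ in range(5)]
for point in points:
    print(point)
```

Combined with marker clustering, this keeps thousands of country-level objects individually clickable when zoomed in.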

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek, or just Święconka, in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

673 COMPLETE REPRESENTATION OF ALL GVP INFO

External Ontologies. See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.

Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basis vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: a custom sub-class, a custom (sub-)prop, or an ontology value (e.g. <term/kind/Abbreviation>).

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
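What an inverse pair buys you can be shown with a toy inference step: for every asserted relation, the opposite direction is added automatically. This is a sketch of the idea only; the property names (`ex:produces`, `ex:producedBy`, `ex:relatedTo`) are invented and are not actual GVP associative relations.

```python
# Hypothetical declarations: one inverse pair, one symmetric (self-inverse) property.
INVERSE_OF = {
    "ex:produces": "ex:producedBy",
    "ex:relatedTo": "ex:relatedTo",
}


def add_inverses(triples, inverse_of):
    """Return the input triples plus the inverse statement for each known property."""
    both = dict(inverse_of)
    both.update({v: k for k, v in inverse_of.items()})  # make the lookup bidirectional
    inferred = set(triples)
    for s, p, o in triples:
        if p in both:
            inferred.add((o, both[p], s))
    return inferred


asserted = [
    ("ex:weaver", "ex:produces", "ex:textile"),
    ("ex:loom", "ex:relatedTo", "ex:shuttle"),
]
result = add_inverses(asserted, INVERSE_OF)
for triple in sorted(result):
    print(triple)
```

In a triple store the same effect comes from OWL reasoning over owl:inverseOf and owl:SymmetricProperty declarations rather than from application code.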

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.

68 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
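The first step of that idea, deriving Person records and relations from one flat record, might look like the sketch below. The output structure, the `derive_persons` helper, and the inference rule (a male person's spouse is likely female) are illustrative assumptions; matching against other records is not shown.

```python
RECORD = {
    "id": "123456",
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

# Which "additional name" field implies which relation to the main person.
RELATION_FIELDS = (("nameSpouse", "spouse"), ("nameChild", "child"), ("nameSibling", "sibling"))


def derive_persons(rec):
    """Create a Person per name mentioned, plus relations to the main person."""
    main = {"name": f"{rec['firstName']} {rec['lastName']}",
            "gender": rec.get("gender"), "source": rec["id"]}
    persons, relations = [main], []
    for field_name, relation in RELATION_FIELDS:
        name = rec.get(field_name)
        if not name:
            continue
        person = {"name": name, "source": rec["id"]}
        if relation == "spouse":
            if rec.get("nameSpouseMaiden"):
                person["maidenName"] = rec["nameSpouseMaiden"]
            # Likely inference only, so it is flagged rather than stated as fact.
            if main["gender"] == "Male":
                person["gender"] = "Female"
                person["genderInferred"] = True
        persons.append(person)
        relations.append((main["name"], relation, name))
    return persons, relations


persons, relations = derive_persons(RECORD)
```

Keeping the `source` record id on every derived Person makes it possible to trace a later match back to the evidence.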

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline for free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews:

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
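Both the similarity scores and the "semantic differencing" come down to cosine comparisons between word vectors. The sketch below shows the arithmetic on hand-made 3-dimensional toy vectors; real word2vec vectors have hundreds of dimensions and are learned from the corpus, and the words and numbers here are invented purely to make the calculation visible.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to a - b + c, excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors with made-up dimensions (roughly: royalty, male, female).
TOY = {
    "king": [0.9, 0.9, 0.1],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "queen": [0.9, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}

print(analogy(TOY, "king", "man", "woman"))
```

The neighbor lists in the table are the same operation without the subtraction: rank every word by its cosine to the query word's vector.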

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia. Through VIAF id: time and nationality from ULAN.

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals. 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics Eg a mint

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

Page 10: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

2 GLAM CONTENT STANDARDSGLAM data is complex and varied

Exception is the ruleMany metadata format variationsData comes from a variety of systems

Thus professional organizations have found it useful to de ne content standards

Describe what data to capture (and sometimes how to go about it)Before formalizing how to express it in machine-readable form

Examples are extremely useful for data modelers to decide how to map the data

21 MUSEUM CONTENT STANDARDS content standard for art architecture museumsCataloging Cultural Objects

211 CCO EXAMPLE ARTWORK AND CREATOR RECORD

212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS

213 CCO EXAMPLE CREATOR EXTENT

How to describe one aspect of the data

214 SPECTRUM

UK Museum Collections Management Standard

De nes procedures for museums to follow and the attendant dataCovers 21 procedures Pre-entry Object entry Loans in Acquisition Inventorycontrol Location and movement control Transport Cataloguing Object conditionchecking and technical assessment Conservation and collections care Riskmanagement Insurance and indemnity management Valuation control Audit Rightsmanagement Use of collections Object exit Loans out Loss and damageDeaccession and disposal Retrospective documentationAddresses accreditation

215 SPECTRUM EXAMPLE OBJECT ENTRY

22 ARCHIVAL CONTENT STANDARDSISAD(G) archival materialsISAAR(CPF) agents (corporations people families)ISDF functions (eg Secretary of some society)ISDIAH archival holding institutions

Image by DPitti 2015

23 LIBRARY CONTENT STANDARDSAACR2 (Anglo-American Cataloging Rules 2)International Standard Bibliographic Description (ISBD)Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later) But sometimes pay moreattention where to put the commas than to

Data sharingGlobal availability of resourcesSharing the cataloging burden

231 FRBR FRSAD FRAD

Functional Requirements for Bibliographic Records (FRBR) Subject Authority Data(FRSAD) Authority Data (FRAD) (JMitchell MZeng MZumer 2011)

232 FRBR

Starts from user tasks ( nd identify select obtain explore) Introduces the important 4-level WEMI model (relates to Uniform Titles)

Work original or derived intellectual work (eg Don Quixote)Expression translation or edition (eg Don Quixote translation to English)Manifestation publishers work (eg with illustrations foreword by compilationhellip)ISBNs are hereItem physical copy libraries track loanavailability famous copies (eg Lincolns Bible)manuscripts are singleton items

233 FRSAD

Anything can be subject (thema) referred to by various namestitles (nomen)

234 FRBR-LRM

FRBR-Library Reference Model (PRiva PLe Bœuf MŽumer Draft for World-WideReview 2016-02) Merges the previous standards

3 GLAM METADATA SCHEMASHow many of the standards listed in

apply to your work (by Jenn Riley Associate Dean for Digital Initiatives atMcGill University Library)

Seeing Standards A Visualization of the MetadataUniverse

31 SEEING STANDARDS (2)

32 XML SCHEMASDo you deal with XML I bet you do

XML Schema (XSD) most widely used but most unwieldyRelaxNG (RNG) new generation schema languageRNG Compact (RNC) non-XML notation most readable Eg EAD3 is mastered inRNC then RNG and XSD producedSchematron express rules in XPath that cant be captured in XSDRNGRNC (egcross- eld validation)

Tools

patch the jing RNG validator toemit errors like Schematron (SVRL with XPath error location)

RNC tools and CH schemas in RNC Emacswith code highlighting and syntax checking ( ycheck)

httpsgithubcomEHRIjing-trangtreeEHRI-176

httpsgithubcomVladimirAlexievrnc

33 MUSEUM METADATA CDWACategories for the Description of Works of Art (CDWA) realization of CCO 532categories (data elements)

331 CDWA LITE

XML schema implementing part of CDWA Moderate complexity about 300 elementsDisplay vs Indexing (structured) elements eg for Dimension

332 CONA SCHEMA

Cultural Objects Name Authority (CONA) Getty museum data aggregation Moderatecomplexity about 280 elements

333 SPECTRUM XML

has 10 entities and 592 elds of which 490 are Object(artwork) elds I am not aware of any systems producing thisSPECTRUM Schema 40b

334 LIDO

Lightweight Information Describing Objects (LIDO) Evolved from CDWA museumdatwith inspiration from CIDOC CRM (Images by RStein and AVitzthum ATHENAworkshop 2010)

335 LIDO SCHEMA

Complex schema eg when referring to a related object you can provide almost asmuch detail as for the main object Could leverage opportunities for linking moreDisplay vs Indexing (structured) elements inherited from CDWA

34 ARCHIVE METADATAEAD Encoded Archival Description Describes archival materials (documentary units)EACCPF Encoded Archival Context Corporations Persons FamiliesEAG Encoded Archival Guide Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation not enough to linking (dif cult to semanticize)Emphasis on documents not historic agents and events

EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)

bioghist

ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt

35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Fielding 2002

MARC is dead (is it really) in-depth discussion wiki

marc-must-dieinfoFutureLibFacebook group

by Sally Chambers ELAG 2011Presentation

4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering

RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info

41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized for not being expressive enough, e.g. it can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use only the minimal features and make mistakes. Europeana didn't do a lot of validation

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana

Europeana Data Quality Committee: formed to push this strategic point (2015-2020)

Evolving specification (since 2009)

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
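On the validation point above, a hedged sketch of the kind of check a provider could run before submitting. The property list reflects my understanding of the mandatory ore:Aggregation properties in EDM; treat it as an assumption and verify against the current EDM guidelines. The dict-based record format and check_aggregation are purely illustrative.

```python
# Assumed mandatory properties of ore:Aggregation in EDM (verify against the EDM guidelines).
REQUIRED = ["edm:aggregatedCHO", "edm:dataProvider", "edm:provider", "edm:rights"]
# At least one link to the digital object is also expected.
ONE_OF = ["edm:isShownAt", "edm:isShownBy"]

def check_aggregation(agg):
    """agg maps property name -> value. Returns a list of problems (empty if none)."""
    problems = [f"missing {p}" for p in REQUIRED if not agg.get(p)]
    if not any(agg.get(p) for p in ONE_OF):
        problems.append("missing one of " + " / ".join(ONE_OF))
    return problems

good = {
    "edm:aggregatedCHO": "http://example.org/cho/1",
    "edm:dataProvider": "Example Museum",
    "edm:provider": "Example Aggregator",
    "edm:rights": "http://creativecommons.org/publicdomain/mark/1.0/",
    "edm:isShownAt": "http://example.org/object/1",
}
bad = {"edm:aggregatedCHO": "http://example.org/cho/2", "edm:provider": "Example Aggregator"}

print(check_aggregation(good))
print(check_aggregation(bad))
```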

42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. Developed by CIDOC (the ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving
About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type
135 props (plus their inverses), organized in a prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
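A small sketch of the short-cut vs long-path construct: the long path goes through a Measurement node (which can carry extra details such as a time-span), and the short-cut can be derived from it. Triples are plain tuples and the ex: identifiers are invented for illustration.

```python
# Long path: thing -> Measurement event -> Dimension, with extra detail on the event.
LONG_PATH = [
    ("ex:painting1", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height1"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:span2016"),
]

def infer_shortcut(triples):
    """Derive crm:P43_has_dimension from P39i_was_measured_by / P40_observed_dimension."""
    measured = [(s, o) for s, p, o in triples if p == "crm:P39i_was_measured_by"]
    observed = {}
    for s, p, o in triples:
        if p == "crm:P40_observed_dimension":
            observed.setdefault(s, []).append(o)
    return [(thing, "crm:P43_has_dimension", dim)
            for thing, measurement in measured
            for dim in observed.get(measurement, [])]

shortcut = infer_shortcut(LONG_PATH)
print(shortcut)
```

The reverse direction is not possible: the short-cut alone cannot tell when or by whom the measurement was made.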

43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
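To make the "Image and region over it" case concrete, here is a hedged sketch of an annotation in the JSON-LD shape defined by the Web Annotation Data Model, built as a Python dict. The example.org URLs and the comment text are invented; the keys and the media-fragment selector follow the W3C model as I understand it.

```python
import json

annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://example.org/anno/1",
    "type": "Annotation",
    # body: what is being said
    "body": {"type": "TextualBody", "value": "Signature of the artist", "format": "text/plain"},
    # target: what it is said about (a rectangular region of an image)
    "target": {
        "source": "http://example.org/image/1.jpg",
        "selector": {
            "type": "FragmentSelector",
            "conformsTo": "http://www.w3.org/TR/media-frags/",
            "value": "xywh=100,150,300,200",
        },
    },
}

doc = json.dumps(annotation, indent=2)
print(doc)
```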

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
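A sketch of how an IIIF Image API request is put together: region, size, rotation, quality and format are path segments after the image identifier. The server URL and identifier are placeholders; the "full" size keyword is assumed from Image API 2.x (later versions use "max"), so check the version your server implements.

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    # The identifier must be URL-encoded, including any slashes it contains.
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"

whole = iiif_image_url("http://example.org/iiif", "page1")
# A 300x200 region starting at (100,150), scaled to 150 pixels wide
detail = iiif_image_url("http://example.org/iiif", "ms/page1",
                        region="100,150,300,200", size="150,")
print(whole)
print(detail)
```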

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD

45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
BIBFRAME: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
Schema BibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats

453 A TASTE OF FRBROO

EDM-FRBRoo Application Profile Task Force: asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

"FRBR, Before and After" by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
"Mistakes have been made", K.Coyle, SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf / rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search for related books on a Kiosk

4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none of them is very good

E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Progress report (2015), Mlist for comments
Conceptual Model 10 (Sep 2016): documents key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)

51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used for Wikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info like this

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo

52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
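A sketch of why coreferencing matters for name enrichment: once records are linked, their name forms can be pooled per cluster. The clustering below is a plain union-find over "same as" links; the record IDs and names are invented placeholders, not real VIAF/Wikidata identifiers.

```python
def coreference_clusters(links):
    """Group record IDs connected by coreference links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])

def merged_names(cluster, names):
    """Pool the name forms of all records in one cluster."""
    return sorted({n for rec in cluster for n in names.get(rec, [])})

links = [("wikidata:Q1", "viaf:100"), ("viaf:100", "gnd:11"), ("wikidata:Q2", "ulan:500")]
names = {
    "wikidata:Q1": ["Lucas Cranach", "Лукас Кранах"],
    "viaf:100": ["Cranach, Lucas", "Lucas Cranach"],
    "gnd:11": ["Cranach, Lucas, der Ältere"],
}
clusters = coreference_clusters(links)
print(clusters)
print(merged_names(clusters[0], names))
```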

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT: single sign-on, a click per item. Easy!

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

Watch the video (D.Oldman). (Not Ontotext work)

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed a mapping of BM data (2M objects) to CIDOC CRM together with BM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but monolithic and has imprecisions
Includes the (in)famous diagram

64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL

66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
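A minimal sketch of jittering, assuming a simple approach: each flag is moved to a random point within a small disk around the place's coordinates. This is not necessarily how the EFD app does it; the radius, the approximate center of Bulgaria and the function name are illustrative, and the offset is in degrees (so the disk is slightly stretched on the ground).

```python
import math
import random

def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return (lat, lon) moved to a random point within radius_deg of the original."""
    angle = rng.uniform(0, 2 * math.pi)
    # sqrt makes points uniform over the disk area instead of crowding the center
    dist = radius_deg * math.sqrt(rng.random())
    return lat + dist * math.sin(angle), lon + dist * math.cos(angle)

rng = random.Random(42)          # fixed seed so the layout is reproducible
center = (42.7, 25.5)            # rough center of Bulgaria (approximate)
flags = [jitter(center[0], center[1], 0.5, rng) for _ in range(5)]
for lat, lon in flags:
    print(f"{lat:.4f}, {lon:.4f}")
```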

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition

67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of the LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: "On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI)". Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See "GVP LOD: Ontologies and Semantic Representation", V.Alexiev, CIDOC 2014
External Ontologies

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basis vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™
Key val can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse)
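A sketch of what those declarations buy you: given one direction of a relation, the other can be materialized automatically. The ex: property names are invented stand-ins for associative relations, and the triples are plain tuples rather than a real RDF store with OWL inference.

```python
# Stand-ins for owl:inverseOf pairs and owl:SymmetricProperty declarations (hypothetical names).
INVERSE_OF = {"ex:teacher_of": "ex:student_of", "ex:student_of": "ex:teacher_of"}
SYMMETRIC = {"ex:related_to"}

def materialize_inverses(triples):
    """Return the input triples plus every triple implied by inverse/symmetric declarations."""
    result = set(triples)
    for s, p, o in triples:
        if p in INVERSE_OF:
            result.add((o, INVERSE_OF[p], s))
        if p in SYMMETRIC:
            result.add((o, p, s))
    return result

asserted = [
    ("ex:verrocchio", "ex:teacher_of", "ex:leonardo"),
    ("ex:watercolor", "ex:related_to", "ex:gouache"),
]
inferred = materialize_inverses(asserted)
for t in sorted(inferred):
    print(t)
```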

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data, Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in "Getty Vocabs: Why LOD? Why Now?" (J.Cobb, 2014). E.g.

AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
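A rough sketch of that idea, assuming a flat record with the field names shown above. The derived-record structure, the relation labels and the inference rules (carrying over the maiden name; guessing the spouse's gender) are illustrative assumptions, not EHRI's actual pipeline; inferred values should be flagged as uncertain in a real system.

```python
record = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

# Which name fields mention another person, and how that person relates to the main one
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}

def derive_persons(rec):
    """Create Person records for everyone mentioned, plus relations to the main person."""
    main = {"name": f"{rec['firstName']} {rec['lastName']}", "gender": rec.get("gender")}
    persons, relations = [main], []
    for field, relation in RELATION_FIELDS.items():
        if not rec.get(field):
            continue
        person = {"name": rec[field]}
        if relation == "spouse":
            if rec.get("nameSpouseMaiden"):
                person["maidenName"] = rec["nameSpouseMaiden"]
            # Likely (not certain) inference for a 1921 marriage record
            if rec.get("gender") == "Male":
                person["gender"] = "Female"
        persons.append(person)
        relations.append((main["name"], relation, person["name"]))
    return persons, relations

persons, relations = derive_persons(record)
print(persons)
print(relations)
```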

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
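A toy sketch of the two operations behind that table: cosine similarity between word vectors, and vector arithmetic followed by a nearest-neighbour lookup. The 3-dimensional vectors are invented so the arithmetic works out; real word2vec vectors have hundreds of dimensions and are learned from the interview text.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def analogy(vectors, minus, plus_from, plus_to):
    """Nearest word to vec(plus_from) - vec(minus) + vec(plus_to), excluding the inputs."""
    target = [f - m + t for f, m, t in
              zip(vectors[plus_from], vectors[minus], vectors[plus_to])]
    candidates = [w for w in vectors if w not in (minus, plus_from, plus_to)]
    return max(candidates, key=lambda w: cosine_similarity(vectors[w], target))

# Invented toy vectors: (organization-ness, Soviet context, German context)
vectors = {
    "KGB": (1.0, 1.0, 0.0),
    "Stalin": (0.0, 1.0, 0.0),
    "Hitler": (0.0, 0.0, 1.0),
    "SS": (1.0, 0.0, 1.0),
    "guard": (0.5, 0.2, 0.2),
}

result = analogy(vectors, minus="Stalin", plus_from="KGB", plus_to="Hitler")
print(result)
```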

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN

612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6134 NOMISMA

Shared authorities for numismatics Eg a mint

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius... rendered in a chart using d3.js

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki


21 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture, museums

211 CCO EXAMPLE ARTWORK AND CREATOR RECORD

212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS

213 CCO EXAMPLE CREATOR EXTENT

How to describe one aspect of the data

214 SPECTRUM

UK Museum Collections Management Standard

Defines procedures for museums to follow, and the attendant data
Covers 21 procedures: Pre-entry; Object entry; Loans in; Acquisition; Inventory control; Location and movement control; Transport; Cataloguing; Object condition checking and technical assessment; Conservation and collections care; Risk management; Insurance and indemnity management; Valuation control; Audit; Rights management; Use of collections; Object exit; Loans out; Loss and damage; Deaccession and disposal; Retrospective documentation
Addresses accreditation

215 SPECTRUM EXAMPLE OBJECT ENTRY

22 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions

Image by DPitti 2015

23 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But sometimes they pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

231 FRBR FRSAD FRAD

Functional Requirements for: Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J.Mitchell, M.Zeng, M.Žumer, 2011)

232 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles)

Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation...). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
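A small sketch of the WEMI levels as nested data, to show where things like ISBNs and physical copies sit. The class and field names, the ISBN placeholder and the holdings are invented; FRBR itself is a conceptual model, not a data structure.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Item:                      # one physical copy
    holder: str
    note: str = ""

@dataclass
class Manifestation:             # one publication; ISBNs live here
    publisher: str
    isbn: str
    items: List[Item] = field(default_factory=list)

@dataclass
class Expression:                # one translation or edition
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)

@dataclass
class Work:                      # the intellectual work
    title: str
    expressions: List[Expression] = field(default_factory=list)

def all_items(work):
    """Walk Work -> Expression -> Manifestation -> Item and collect every copy."""
    return [item for e in work.expressions for m in e.manifestations for item in m.items]

quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("Example Press", "ISBN-PLACEHOLDER-1",
                                    [Item("City Library"), Item("University Library")])]),
    Expression("en", [Manifestation("Sample Books", "ISBN-PLACEHOLDER-2",
                                    [Item("City Library", note="on loan")])]),
])

print(len(all_items(quixote)), "copies of", quixote.title)
```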

233 FRSAD

Anything can be subject (thema) referred to by various namestitles (nomen)

234 FRBR-LRM

FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards

3 GLAM METADATA SCHEMAS
How many of the standards listed in "Seeing Standards: A Visualization of the Metadata Universe" apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

31 SEEING STANDARDS (2)

32 XML SCHEMAS
Do you deal with XML? I bet you do

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)

Tools

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)

https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
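To illustrate what "cross-field validation" means, here is a plain-Python analog of a Schematron-style rule: a grammar can say that a date range has a start and an end, but not that the start must precede the end. This is not Schematron itself; the element names and the sample document are illustrative.

```python
import xml.etree.ElementTree as ET

def check_date_ranges(xml_text):
    """Report every <dateRange> whose <fromDate> is later than its <toDate>."""
    root = ET.fromstring(xml_text)
    errors = []
    for el in root.iter("dateRange"):
        start, end = el.findtext("fromDate"), el.findtext("toDate")
        # ISO 8601 dates (YYYY-MM-DD) compare correctly as strings
        if start and end and start > end:
            errors.append(f"dateRange {start}..{end}: fromDate is after toDate")
    return errors

SAMPLE = """
<record>
  <dateRange><fromDate>1921-01-05</fromDate><toDate>1945-05-08</toDate></dateRange>
  <dateRange><fromDate>1979-03-15</fromDate><toDate>1978-10-28</toDate></dateRange>
</record>
"""

errors = check_date_ranges(SAMPLE)
for e in errors:
    print(e)
```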

33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements
Display vs Indexing (structured) elements, e.g. for Dimension

332 CONA SCHEMA

Cultural Objects Name Authority (CONA) Getty museum data aggregation Moderatecomplexity about 280 elements

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)

335 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements: inherited from CDWA

34 ARCHIVE METADATAEAD Encoded Archival Description Describes archival materials (documentary units)EACCPF Encoded Archival Context Corporations Persons FamiliesEAG Encoded Archival Guide Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation not enough to linking (dif cult to semanticize)Emphasis on documents not historic agents and events

EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)

bioghist

ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt

35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Fielding 2002

MARC is dead (is it really) in-depth discussion wiki

marc-must-dieinfoFutureLibFacebook group

by Sally Chambers ELAG 2011Presentation

4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering

RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info

41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on

OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object

EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs

Many providers use the minimal features and make mistakes Europeana didnt do alot of validation

Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana

formed to push this strategic point (2015-2020)Europeana Data Quality Committee

Evolving speci cation (since 2009)

Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc

42 CIDOC CRM comprehensive reference model used for history historic events

archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual

CIDOC CRM

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

(or including Kindle) (or including Kindle) essential to

understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails

Video Tutorial HTML versionGraphical Representation continuous HTML version

43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image

and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)

W3C TR

Complete Example

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio

441 IIIF PRESENTATION API

Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD

4.5 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
BIBFRAME: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA) Registry (rdaregistry.info) is well organized.

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

4.5.3 A TASTE OF FRBROO

EDM-FRBRoo Application Profile Task Force: asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR, Before and After by K.Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K.Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf / rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4.5.6.1 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

4.6 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none of them is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015). Mlist for comments.
Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M.Hildebrand)

5.1 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo

5.2 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
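Why coreferencing across datasets pays off can be shown with a small sketch: pairwise "same as" links, even when they come from different sources, merge into clusters, so a link from Wikidata to one VIAF constituent indirectly reaches the others. The identifiers below are placeholders, not real authority ids, and union-find is just one convenient way to compute the clusters.

```python
def coreference_clusters(links):
    """Group identifiers into clusters, given pairwise coreference links."""
    parent = {}

    def find(x):
        # Register unseen ids as their own root, then walk up with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for node in parent:
        clusters.setdefault(find(node), set()).add(node)
    return sorted(sorted(cluster) for cluster in clusters.values())


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    links = [
        ("wikidata:Q_A", "viaf:1"),
        ("viaf:1", "gnd:X"),
        ("wikidata:Q_B", "rkd:7"),
    ]
    for cluster in coreference_clusters(links):
        print(cluster)
```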

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities, VIAF (not in M-n-M but on WD); BM persons, BBC YourPaintings, Artsy, etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)

Research functions are sometimes integrated into Virtual Research Environments.

6.1 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. Drawings by Rembrandt that are about Mammals

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D.Oldman)

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) together with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE

Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
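The slides do not say how the jitter was computed, so the following is only one plausible sketch: each marker is moved by a random offset drawn uniformly from a disk around the original point. The 50 km radius, the approximate centre coordinates and the flat-earth conversion (about 111.32 km per degree of latitude) are all assumptions made for illustration.

```python
import math
import random

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude


def jitter(lat, lon, max_km=50.0, rng=random):
    """Return (lat, lon) displaced by a random offset of at most max_km."""
    # sqrt() makes the points uniform over the disk instead of bunched at the centre.
    distance = max_km * math.sqrt(rng.random())
    bearing = 2 * math.pi * rng.random()
    dlat = distance * math.cos(bearing) / KM_PER_DEGREE
    # A degree of longitude shrinks with latitude, hence the cosine correction.
    dlon = distance * math.sin(bearing) / (KM_PER_DEGREE * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


if __name__ == "__main__":
    rng = random.Random(42)      # seeded so that the map is reproducible
    centre = (42.7, 25.5)        # roughly the middle of Bulgaria (approximate)
    for _ in range(3):
        print(jitter(*centre, rng=rng))
```

Seeding the generator keeps each object's flag in the same place between page loads, which matters if users are expected to find an object again.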

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql. Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014. External Ontologies:

Prefix  Ontology                               Used for
bibo    Bibliography Ontology                  Sources
dc      Dublin Core Elements                   common
dct     Dublin Core Terms                      common
foaf    Friend of a Friend ontology            Contributors
iso     ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                  Basic RDF representation
prov    Provenance Ontology                    Revision history
rdf     Resource Description Framework         Basic RDF representation
rdfs    RDF Schema                             Basic RDF representation
schema  Schema.org                             common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System   Basic vocabulary representation
skosxl  SKOS Extension for Labels              Rich labels
wgs     W3C World Geodetic Survey geo          Geo (TGN)
xsd     XML Schema Datatypes                   Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
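What "relations come in owl:inverseOf pairs" buys in practice is that only one direction needs to be asserted; the other follows. Here is a minimal sketch of that materialization, using made-up `ex:` relation names rather than actual GVP relation codes, and plain tuples rather than an RDF library.

```python
# Hypothetical relations for illustration; GVP defines its own relation codes.
INVERSE_OF = {"ex:teacher_of": "ex:student_of"}   # owl:inverseOf pairs
SYMMETRIC = {"ex:sibling_of"}                      # owl:SymmetricProperty (self-inverse)


def with_inverses(triples):
    """Return the triples plus the statements implied by inverse/symmetric props."""
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})  # pairs work both ways
    inverse.update({p: p for p in SYMMETRIC})              # a symmetric prop inverts to itself
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


if __name__ == "__main__":
    asserted = [
        ("ex:artist1", "ex:teacher_of", "ex:artist2"),
        ("ex:artist2", "ex:sibling_of", "ex:artist3"),
    ]
    for triple in sorted(with_inverses(asserted)):
        print(triple)
```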

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). E.g.:

AAT used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

6.8 J.P.GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like the one pictured.

6.8.1 J.P.GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress, 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
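The idea of deriving Person records from the names mentioned in a record might look roughly like the sketch below. The field names follow the sample record; the `Person` class, the `derive_persons` function and the particular inferences (spouse's gender, child born after the marriage) are illustrative assumptions, not the EHRI implementation, and such inferences are only "likely", so they are kept separate from stated facts.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    relation: str                                  # relation to the record's main person
    inferred: dict = field(default_factory=dict)   # likely, not certain, facts


def derive_persons(rec):
    """Create Person stubs for the people mentioned in one record."""
    persons = []
    marriage = rec.get("dateMarriage")
    if "nameSpouse" in rec:
        inferred = {}
        if rec.get("gender") in ("Male", "Female"):
            inferred["gender"] = "Female" if rec["gender"] == "Male" else "Male"
        if "nameSpouseMaiden" in rec:
            inferred["maidenName"] = rec["nameSpouseMaiden"]
        if marriage:
            inferred["dateMarriage"] = marriage
        persons.append(Person(rec["nameSpouse"], "spouse", inferred))
    if "nameChild" in rec:
        inferred = {"bornAfter": marriage} if marriage else {}
        persons.append(Person(rec["nameChild"], "child", inferred))
    if "nameSibling" in rec:
        persons.append(Person(rec["nameSibling"], "sibling"))
    return persons


if __name__ == "__main__":
    rec = {
        "id": 123456, "firstName": "John", "lastName": "Smith", "gender": "Male",
        "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
        "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
        "nameSibling": "Jack Jones",
    }
    for person in derive_persons(rec):
        print(person)
```

The derived stubs can then be fed to whatever matching step compares them against existing Person records.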

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos. dist.   punishment    Cos. dist.
guarding   0.593507     punishments   0.668144
sentry     0.512083     punish        0.601212
hlinka     0.496201     punishing     0.543213
gate       0.490032     beatings      0.527033
watching   0.484647     penalty       0.497262
rifle      0.484379     deserved      0.490157
lookout    0.482025     beaten        0.473870
patrol     0.477233     straf         0.473338
soldier    0.475982     offense       0.461230
guarded    0.474689     executing     0.459965
police     0.474291     merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
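Both the neighbour table and the "semantic differencing" line rest on the same two operations: cosine similarity between word vectors, and vector arithmetic followed by a nearest-neighbour lookup. The sketch below shows them on invented 3-dimensional toy vectors (real word2vec vectors have hundreds of dimensions and are learned from the corpus); the numbers in the table appear to be cosine similarities, where higher means closer, even though the column is labelled "distance".

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], query))


# Invented toy vectors, for illustration only:
# (is a security organisation, associated with USSR, associated with Germany).
TOY = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS":     [1.0, 0.0, 1.0],
    "guard":  [0.5, 0.2, 0.2],
}

if __name__ == "__main__":
    print(analogy(TOY, "KGB", "Stalin", "Hitler"))
```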

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius... rendered in a chart using d3.js

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki


2.1.1 CCO EXAMPLE: ARTWORK AND CREATOR RECORD

2.1.2 CCO EXAMPLE: HIERARCHICAL LINK BETWEEN 2 ARTWORKS

2.1.3 CCO EXAMPLE: CREATOR EXTENT

How to describe one aspect of the data

2.1.4 SPECTRUM

UK Museum Collections Management Standard

Defines procedures for museums to follow, and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation

2.1.5 SPECTRUM EXAMPLE: OBJECT ENTRY

2.2 ARCHIVAL CONTENT STANDARDS

ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions

Image by D.Pitti, 2015

2.3 LIBRARY CONTENT STANDARDS

AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But they sometimes pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

2.3.1 FRBR, FRSAD, FRAD

Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J.Mitchell, M.Zeng, M.Zumer, 2011)

2.3.2 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):

Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation...). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
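The four WEMI levels nest naturally, which a small data-structure sketch can make tangible. The class and field names below are illustrative choices, not taken from FRBR itself, and the publishers and barcodes are placeholders; the point is only that ISBNs sit on the Manifestation while loan status sits on the Item.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:                      # one physical copy
    barcode: str
    on_loan: bool = False


@dataclass
class Manifestation:             # a publisher's edition; ISBNs live here
    publisher: str
    isbn: Optional[str] = None
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:                # a translation or edition of the text
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:                      # the abstract intellectual work
    title: str
    expressions: List[Expression] = field(default_factory=list)


def available_items(work, language):
    """All copies of a work in a given language that are not on loan."""
    return [
        item
        for expression in work.expressions if expression.language == language
        for manifestation in expression.manifestations
        for item in manifestation.items if not item.on_loan
    ]


# Placeholder data, for illustration only.
don_quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("Publisher A", items=[Item("c1")])]),
    Expression("en", [Manifestation("Publisher B",
                                    items=[Item("c2", on_loan=True), Item("c3")])]),
])

if __name__ == "__main__":
    print([item.barcode for item in available_items(don_quixote, "en")])
```

A user task such as "obtain an English copy" thus walks down from Work to Item, which is the kind of navigation a flat record format makes difficult.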

2.3.3 FRSAD

Anything can be a subject (thema), referred to by various names/titles (nomen)

2.3.4 FRBR-LRM

FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards

3 GLAM METADATA SCHEMAS

How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

3.1 SEEING STANDARDS (2)

3.2 XML SCHEMAS

Do you deal with XML? I bet you do.

XML Schema (XSD): most widely used, but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC; Emacs with code highlighting and syntax checking (flycheck)

3.3 MUSEUM METADATA: CDWA

Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)

3.3.1 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension

3.3.2 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements

3.3.3 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this

3.3.4 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R.Stein and A.Vitzthum, ATHENA workshop, 2010)

3.3.5 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements: inherited from CDWA

3.4 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.

EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
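To see what can and cannot be recovered from such a bioghist, here is a small sketch that pulls structured events out of the example using Python's standard XML parser. The output structure (a list of dicts) is an arbitrary choice, and the XML is embedded without a namespace, as in the example; real EAD files may declare one. Note that persons come out only as name strings, which is exactly the "strings, not things" problem noted above.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, event text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # Flatten mixed content (text plus persname elements) into one string.
            "text": " ".join("".join(event.itertext()).split()),
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


if __name__ == "__main__":
    for e in extract_events(BIOGHIST):
        print(e)
```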

3.5 LIBRARY METADATA: MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002

marc-must-die.info: wiki
FutureLib: Facebook group
MARC is dead (is it really?): in-depth discussion. Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting", and conversion back to some other format "lowering"?

RDF is a simple, abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info

4.1 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana

Europeana Data Quality Committee formed to push this strategic point (2015-2020)

Evolving specification (since 2009)

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.

42 CIDOC CRM comprehensive reference model used for history historic events

archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual

CIDOC CRM

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

(or including Kindle) (or including Kindle) essential to

understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails

Video Tutorial HTML versionGraphical Representation continuous HTML version

43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image

and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)

W3C TR

Complete Example

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio

441 IIIF PRESENTATION API

Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD

45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies

BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat

soundly criticized

httpbibschemaorg

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats

453 A TASTE OF FRBROO

Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le

TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Contexts (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015)
Conceptual Model 1.0 (Sep 2016), mailing list for comments. Documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities.

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used for Wikidata Project Sum of All Paintings.

Works by painter across collections (catalogue raisonné). E.g. Frans Hals.

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

515 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: "Authority Addicts: The New Frontier of Authority Control on Wikidata", Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: "Name Data Sources for Semantic Enrichment", study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences: 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies): "Prosopography is Greek for Facebook" (2015, SNAP:DRGN project)

Research functions, sometimes integrated into Virtual Research Environments.

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. "Drawings by Rembrandt that are about Mammals".

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint; open source (GitHub). Uses GraphDB and its ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

"Domain-specific modeling: Towards a Food and Drink Gazetteer". Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

"Using DBpedia in Europeana Food and Drink". Alexiev V., DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them: each marker is moved by a small random offset.
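The jittering idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the EFD implementation: the centroid coordinates, the 50 km radius and the function name `jitter` are all hypothetical, and the degree-to-km conversion is a rough approximation.

```python
import math
import random


def jitter(lat, lon, radius_km=50.0, rng=random):
    """Return (lat, lon) moved by a random offset of at most radius_km."""
    # Uniform sampling over a disc: the square root keeps points
    # from bunching up near the center.
    distance_km = radius_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # Rough conversion: one degree of latitude is about 111 km;
    # a degree of longitude shrinks with cos(latitude).
    dlat = (distance_km * math.cos(bearing)) / 111.0
    dlon = (distance_km * math.sin(bearing)) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


rng = random.Random(42)          # seeded so the layout is reproducible
center = (42.75, 25.5)           # assumed approximate centroid of Bulgaria
markers = [jitter(*center, radius_km=50.0, rng=rng) for _ in range(5)]
```

A seeded generator keeps markers in the same place across page loads; an alternative would be to derive the offset from a hash of the object identifier.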

6610 GLAMS WORKING WITH WIKIDATA

GLAMs Working with Wikidata: why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

Easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (Diagram by J. Cobb, 2014).
GVP Training Materials

671 GVP LOD RELEASES

AAT 2014-02, TGN 2014-08, ULAN 2015-03. Publicized in blog posts by J. Cuno, head of the Getty Trust.

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: "On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI)". Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

External Ontologies. See V. Alexiev, "GVP LOD: Ontologies and Semantic Representation", CIDOC 2014.

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class; Custom (sub-)prop; Ontology Value (e.g. <term/kind/Abbreviation>).

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
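What inverse and symmetric declarations buy you can be sketched as follows. This is a minimal illustration, not how GraphDB or the GVP rules are implemented; the relation names `ex:student_of`, `ex:teacher_of` and `ex:related_to` and the subject/object identifiers are hypothetical placeholders, not actual GVP properties.

```python
# Hypothetical relation names, for illustration only.
INVERSE_OF = {"ex:student_of": "ex:teacher_of"}   # owl:inverseOf pairs
SYMMETRIC = {"ex:related_to"}                     # owl:SymmetricProperty


def with_inverses(triples):
    """Return the triples plus every statement implied by inverse/symmetric props."""
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})  # pairs work both ways
    inverse.update({p: p for p in SYMMETRIC})              # self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


asserted = {
    ("ex:A", "ex:student_of", "ex:B"),
    ("ex:A", "ex:related_to", "ex:C"),
}
closed = with_inverses(asserted)
```

Only one direction of each relation needs to be stored or edited; the other direction is derived, which is why declaring the pairs in the ontology is worthwhile.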

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

"Getty Vocabularies: Linked Open Data: Semantic Representation". Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in "Getty Vocabs: Why LOD? Why Now?" (J. Cobb, 2014). E.g.:

AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

68 J. P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 J. P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

"Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure". V. Alexiev, L. Brazzo, CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
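The first two steps (creating Person records for the people mentioned and making likely inferences) can be sketched as below. This is an assumption-laden illustration, not EHRI code: the `Person` class, the `expand` function and the specific inference (spouse gender guessed from the main person's gender, flagged as a guess) are all hypothetical; only the field names come from the sample record above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    name: str
    gender: Optional[str] = None                    # None means unknown
    maiden_name: Optional[str] = None
    relations: list = field(default_factory=list)   # (relation, other person's name)
    inferred: list = field(default_factory=list)    # names of guessed fields


RECORD = {
    "id": 123456, "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

INVERSE = {"spouse": "spouse", "child": "parent", "sibling": "sibling"}


def expand(record):
    """Create one Person for the record and one per person it mentions."""
    main = Person(f"{record['firstName']} {record['lastName']}", record.get("gender"))
    people = [main]
    for key, relation in (("nameSpouse", "spouse"),
                          ("nameChild", "child"),
                          ("nameSibling", "sibling")):
        if key not in record:
            continue
        other = Person(record[key])
        main.relations.append((relation, other.name))
        other.relations.append((INVERSE[relation], main.name))
        if relation == "spouse":
            other.maiden_name = record.get("nameSpouseMaiden")
            # Likely (not certain) inference for historical records: flag it.
            if main.gender in ("Male", "Female"):
                other.gender = "Female" if main.gender == "Male" else "Male"
                other.inferred.append("gender")
        people.append(other)
    return people


people = expand(RECORD)
```

Keeping an explicit list of inferred fields matters for the later matching step: a guessed value should weigh less than a recorded one when comparing against other Person records.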

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
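For readers unfamiliar with the two operations behind this slide, here is a minimal sketch of cosine similarity and of vector "differencing" (a - b + c). The 2-dimensional vectors and word labels are invented toy data, not the INRIA model; real word2vec vectors have hundreds of dimensions. The "Cos dist" values in the table behave like cosine similarities (higher means closer).

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Word whose vector is closest to a - b + c, excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors (invented): axis 0 separates two regimes,
# axis 1 separates leaders from organisations.
TOY = {
    "leader_a": [1.0, 1.0],
    "agency_a": [1.0, -1.0],
    "leader_b": [-1.0, 1.0],
    "agency_b": [-1.0, -1.0],
    "unrelated": [0.0, 1.0],
}

answer = analogy(TOY, "agency_a", "leader_a", "leader_b")
```

With these toy vectors, `agency_a - leader_a + leader_b` lands exactly on `agency_b`, which mirrors the structure of the example on the slide.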

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames, so we can get coordinates.

611 OTHER PROJECTS WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia. Through VIAF id: time and nationality from ULAN.

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki


212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS

213 CCO EXAMPLE CREATOR EXTENT

How to describe one aspect of the data

214 SPECTRUM

UK Museum Collections Management Standard

Defines procedures for museums to follow, and the attendant data.
Covers 21 procedures: Pre-entry; Object entry; Loans in; Acquisition; Inventory control; Location and movement control; Transport; Cataloguing; Object condition checking and technical assessment; Conservation and collections care; Risk management; Insurance and indemnity management; Valuation control; Audit; Rights management; Use of collections; Object exit; Loans out; Loss and damage; Deaccession and disposal; Retrospective documentation.
Addresses accreditation.

215 SPECTRUM EXAMPLE OBJECT ENTRY

22 ARCHIVAL CONTENT STANDARDS

ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions

Image by D. Pitti, 2015

23 LIBRARY CONTENT STANDARDS

AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But sometimes pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

231 FRBR FRSAD FRAD

Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Žumer, 2011)

232 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):

Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
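The four WEMI levels can be pictured as a simple chain of records, each pointing to the level above it. The sketch below is only an illustration of that structure, not any FRBR implementation; the class fields, publisher names, ISBN strings and holders are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    title: str


@dataclass(frozen=True)
class Expression:
    work: Work
    language: str


@dataclass(frozen=True)
class Manifestation:
    expression: Expression
    publisher: str
    isbn: str            # ISBNs live at the Manifestation level


@dataclass(frozen=True)
class Item:
    manifestation: Manifestation
    holder: str          # the library that tracks this physical copy


def same_work(a: Item, b: Item) -> bool:
    """Two physical copies realize the same Work if their chains meet at the top."""
    return a.manifestation.expression.work == b.manifestation.expression.work


quixote = Work("Don Quixote")
spanish = Expression(quixote, "es")
english = Expression(quixote, "en")
# Publisher names, ISBNs and holders below are placeholders.
copy_a = Item(Manifestation(spanish, "Publisher A", "ISBN-A"), "Library X")
copy_b = Item(Manifestation(english, "Publisher B", "ISBN-B"), "Library Y")
```

A catalogue that only records ISBNs cannot tell that `copy_a` and `copy_b` are the same novel; walking up to the Work level can.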

233 FRSAD

Anything can be subject (thema), referred to by various names/titles (nomen).

234 FRBR-LRM

FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards.

3 GLAM METADATA SCHEMAS

How many of the standards listed in "Seeing Standards: A Visualization of the Metadata Universe" apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

31 SEEING STANDARDS (2)

32 XML SCHEMAS

Do you deal with XML? I bet you do.

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new-generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

33 MUSEUM METADATA CDWA

Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.

332 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)

335 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA.

34 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.

EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
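When person names are tagged, as in this example, pulling out structured events is straightforward. The sketch below uses Python's standard `xml.etree.ElementTree`; the XML string is transcribed from the example above (attribute quoting and the commas inside `normal` are assumed), and the function name `events` is hypothetical.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""


def events(xml_text):
    """Return (normalized date, tagged person names, event text) per chronitem."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("chronitem"):
        date = item.find("date").get("normal")
        event = item.find("event")
        people = [p.get("normal") for p in event.iter("persname")]
        # Collapse whitespace in the mixed-content event text.
        text = " ".join("".join(event.itertext()).split())
        rows.append((date, people, text))
    return rows


rows = events(BIOGHIST)
```

The second event has no tagged names, so only its date and free text can be extracted; this is exactly the "strings, not things" limitation noted above.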

35 LIBRARY METADATA MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles.
MARC-XML is not much better.

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002.

marc-must-die.info: in-depth discussion wiki
FutureLib: Facebook group
Presentation "MARC is dead (is it really?)" by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting", and back to some other format "lowering"?

RDF is a simple abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info
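The "no nesting bias" point can be made concrete with a toy lifting function that flattens records into (subject, property, value) triples. This is a simplified sketch, not a real RDF library: the function `lift`, the `ex:` identifiers and the property names are hypothetical, and it does not distinguish IRIs from literals.

```python
import itertools


def lift(record, subject, counter=None):
    """Flatten a (possibly nested) dict into (subject, property, value) triples."""
    if counter is None:
        counter = itertools.count(1)
    triples = []
    for prop, value in record.items():
        if prop == "id":
            continue
        for v in (value if isinstance(value, list) else [value]):
            if isinstance(v, dict):
                # A nested record becomes its own node; without an id it
                # gets a blank-node style label.
                node = v.get("id") or f"_:b{next(counter)}"
                triples.append((subject, prop, node))
                triples.extend(lift(v, node, counter))
            else:
                triples.append((subject, prop, v))
    return triples


# The same information, once nested and once referenced by ID.
nested = {"id": "ex:book1", "title": "About time",
          "publisher": {"id": "ex:penguin", "name": "Penguin"}}
by_reference = [
    {"id": "ex:book1", "title": "About time", "publisher": "ex:penguin"},
    {"id": "ex:penguin", "name": "Penguin"},
]

from_nested = set(lift(nested, nested["id"]))
from_flat = {t for rec in by_reference for t in lift(rec, rec["id"])}
```

Both inputs yield the same set of three triples, so downstream consumers do not need to know which serialization style the source used.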

41 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.

Europeana Data Quality Committee formed to push this strategic point (2015-2020).

Evolving specification (since 2009):

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.

42 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle), Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.

Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
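The relation between a CRM long-path and its short-cut can be illustrated with a small inference over a set of triples: wherever the two-step path exists, the one-step property is derived. This is a minimal sketch, not how any CRM reasoner is implemented; the function name `add_shortcuts` and the `ex:` instance identifiers are hypothetical, while the three property names are the ones quoted above.

```python
MEASURED_BY = "crm:P39i_was_measured_by"
OBSERVED = "crm:P40_observed_dimension"
HAS_DIMENSION = "crm:P43_has_dimension"


def add_shortcuts(triples):
    """Derive P43_has_dimension from P39i_was_measured_by / P40_observed_dimension."""
    observed = {}
    for s, p, o in triples:
        if p == OBSERVED:
            observed.setdefault(s, []).append(o)
    derived = {
        (thing, HAS_DIMENSION, dimension)
        for thing, p, measurement in triples
        if p == MEASURED_BY
        for dimension in observed.get(measurement, [])
    }
    return set(triples) | derived


# Hypothetical instance data: a painting, one measurement event, one dimension.
long_path = {
    ("ex:painting1", MEASURED_BY, "ex:measurement1"),
    ("ex:measurement1", OBSERVED, "ex:height1"),
}
result = add_shortcuts(long_path)
```

The long-path keeps the measurement event (who measured, when, how), while the derived short-cut serves simple queries that only need the dimension.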

43 WEB ANNOTATION (OPEN ANNOTATION OA)

W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson presentations: IIIF and JSON-LD.

45 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA): RDAregistry.info is well organized.

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

453 A TASTE OF FRBROO

The EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

"FRBR, Before and After" by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
"Mistakes have been made", K. Coyle, SWIB 2015.

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters: prosopography (micro biographies). See Prosopography is Greek for Facebook (SNAP:DRGN project, 2015)

These are research functions, sometimes integrated into Virtual Research Environments.

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. Drawings by Rembrandt that are about Mammals.

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, we developed a mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI-PMH

Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.

651 EUROPEANA STATISTICS

E.g. a chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

We selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them: each flag is displaced by a small random offset.
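A minimal sketch of such jittering, under stated assumptions: the radius, the coordinates used for the country centroid, and the idea of seeding the random generator with the object id (so a given object always lands on the same spot) are illustrative choices, not details of the EFD application.

```python
import random

def jitter(lat, lon, radius_deg=0.5, seed=None):
    """Displace a point by a uniform random offset of at most radius_deg
    in each direction. The same seed always yields the same offset."""
    rng = random.Random(seed)
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))

# Approximate centroid used for objects tagged only with the country (assumed values)
CENTER = (42.75, 25.5)
flags = {obj_id: jitter(*CENTER, seed=obj_id) for obj_id in ("obj-1", "obj-2", "obj-3")}
```

A square offset like this is the simplest option; a circular or density-aware spread would look more natural on a map but needs more code.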

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of the LODLAM cloud (Diagram by J. Cobb, 2014).
GVP Training Materials

671 GVP LOD RELEASES

AAT 2014-02, TGN 2014-08, ULAN 2015-03. Publicized in blog posts by J. Cuno, head of the Getty Trust.

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation (V. Alexiev, CIDOC 2014). External Ontologies:

Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: a custom sub-class, a custom (sub-)prop, or an ontology value (e.g. <term/kind/Abbreviation>).

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
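What inverse and symmetric declarations buy you can be shown with a small sketch that materializes the implied statements over plain tuples. The property names and entities are hypothetical (in the `ex:` namespace), and a real triple store would do this through its reasoner rather than application code.

```python
def with_inverses(triples, inverse_pairs, symmetric):
    """Return the input triples plus those implied by owl:inverseOf pairs
    and owl:SymmetricProperty declarations."""
    inverse_of = {}
    for p, q in inverse_pairs:  # an inverseOf declaration works in both directions
        inverse_of[p] = q
        inverse_of[q] = p
    closed = set(triples)
    for s, p, o in triples:
        if p in inverse_of:
            closed.add((o, inverse_of[p], s))
        if p in symmetric:
            closed.add((o, p, s))
    return closed

# Hypothetical relations and entities, for illustration only
triples = {
    ("ex:artistA", "ex:teacherOf", "ex:artistB"),
    ("ex:artistB", "ex:collaboratedWith", "ex:artistC"),
}
closed = with_inverses(
    triples,
    inverse_pairs=[("ex:teacherOf", "ex:studentOf")],
    symmetric={"ex:collaboratedWith"},
)
```

Only one direction of each relation needs to be entered by hand; the other direction follows from the declarations.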

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

68 JP GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 JPGETTY MUSEUM AND WIKIDATA

Discussing how to make data available for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. (WD query shown as an image grid.)

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work is ongoing at https://github.com/american-art; e.g. see NPG mapping issues. E.g. a possible mapping of (sculpture) Cast after.

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
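The idea can be sketched as follows. This is an illustration only, not EHRI code: the `Person` class, the relation labels, and the maiden-name inference (given name plus maiden surname) are assumptions made for the example; only the field names come from the sample record.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Source field -> (relation of the main person to the other person, its inverse)
RELATIONS = {
    "nameSpouse": ("spouse", "spouse"),
    "nameChild": ("child", "parent"),
    "nameSibling": ("sibling", "sibling"),
}

@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    alt_names: List[str] = field(default_factory=list)
    relations: List[Tuple[str, str]] = field(default_factory=list)

def derive_persons(rec: dict) -> List[Person]:
    """Create a Person for the record subject and for each person it mentions."""
    main = Person(f"{rec['firstName']} {rec['lastName']}", rec.get("gender"))
    people = [main]
    for fld, (relation, inverse) in RELATIONS.items():
        if fld not in rec:
            continue
        other = Person(rec[fld])
        main.relations.append((relation, other.name))
        other.relations.append((inverse, main.name))
        # Likely inference: the spouse was also known under the maiden surname
        if fld == "nameSpouse" and "nameSpouseMaiden" in rec:
            given = other.name.rsplit(" ", 1)[0]
            other.alt_names.append(f"{given} {rec['nameSpouseMaiden']}")
        people.append(other)
    return people

record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
people = derive_persons(record)
```

The derived records (including alternative names) are then candidates for matching against other Person records, which is where the network emerges.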

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
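A toy sketch of how matching to a gazetteer also deduplicates: variant spellings that resolve to the same gazetteer entry collapse into one group. It uses fuzzy string matching from the Python standard library, which is an assumption for illustration; the actual EHRI pipeline is not described here, and the gazetteer ids below are placeholders, not real Geonames ids.

```python
import difflib
from collections import defaultdict

def match_places(source_names, gazetteer, cutoff=0.8):
    """Map each source place string to the closest gazetteer entry (if any),
    grouping the source strings by the matched gazetteer id."""
    by_key = {name.lower(): gid for name, gid in gazetteer.items()}
    matched = defaultdict(list)
    for raw in source_names:
        hits = difflib.get_close_matches(raw.strip().lower(), list(by_key),
                                         n=1, cutoff=cutoff)
        if hits:
            matched[by_key[hits[0]]].append(raw)
    return dict(matched)

# Placeholder gazetteer ids, for illustration only
gazetteer = {"Warsaw": "geo:1", "Krakow": "geo:2"}
source = ["Warsaw", "Warsaw ", "Warsow", "Krakow", "Nowhere"]
groups = match_places(source, gazetteer)
```

Real place matching also needs the place hierarchy and historical names to disambiguate homonyms, which plain string similarity cannot provide.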

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
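Both the neighbour lists and the "semantic differencing" rest on cosine similarity between word vectors. The sketch below shows the arithmetic on tiny hand-made vectors: the four dimensions and all values are invented so that the analogy works out, whereas real word2vec vectors are learned from the interview text and have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to a - b + c, excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Invented toy vectors; dimensions: [soviet, german, leader, security force]
vectors = {
    "Stalin": [1.0, 0.0, 1.0, 0.0],
    "Hitler": [0.0, 1.0, 1.0, 0.0],
    "KGB":    [1.0, 0.0, 0.0, 1.0],
    "SS":     [0.0, 1.0, 0.0, 1.0],
    "Gulag":  [1.0, 0.0, 0.0, 0.5],
}
result = analogy(vectors, "KGB", "Stalin", "Hitler")
```

Subtracting one leader and adding the other swaps the "regime" component while keeping the "security force" component, which is why the nearest remaining word is the counterpart organization.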

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.


213 CCO EXAMPLE CREATOR EXTENT

How to describe one aspect of the data

214 SPECTRUM

UK Museum Collections Management Standard

Defines procedures for museums to follow and the attendant data.
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation.
Addresses accreditation.

215 SPECTRUM EXAMPLE OBJECT ENTRY

22 ARCHIVAL CONTENT STANDARDS

ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions

(Image by D. Pitti, 2015)

23 LIBRARY CONTENT STANDARDS

AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But they sometimes pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

231 FRBR FRSAD FRAD

Functional Requirements for: Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD). (J. Mitchell, M. Zeng, M. Žumer, 2011)

232 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):

Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
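The four WEMI levels nest naturally, which a small data-structure sketch makes concrete. The class and field names below are illustrative choices, not FRBR terminology beyond the four level names, and the publisher names and barcodes are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Item:                      # a physical copy that a library tracks
    barcode: str
    on_loan: bool = False

@dataclass
class Manifestation:             # a publisher's product; ISBNs live here
    publisher: str
    isbn: Optional[str] = None
    items: List[Item] = field(default_factory=list)

@dataclass
class Expression:                # a translation or edition of the work
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)

@dataclass
class Work:                      # the abstract intellectual creation
    title: str
    expressions: List[Expression] = field(default_factory=list)

def available_items(work: Work) -> List[Item]:
    """All copies of a work, in any expression or manifestation, not on loan."""
    return [item
            for expr in work.expressions
            for manif in expr.manifestations
            for item in manif.items
            if not item.on_loan]

# Placeholder publishers and barcodes, for illustration only
quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("Publisher A", items=[Item("c1")])]),
    Expression("en", [Manifestation("Publisher B",
                                    items=[Item("c2", on_loan=True), Item("c3")])]),
])
free_copies = available_items(quixote)
```

A reader asking for "Don Quixote" can thus be offered any available copy, while a reader who needs the English translation can be restricted to one Expression.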

233 FRSAD

Anything can be a subject (thema), referred to by various names/titles (nomen).

234 FRBR-LRM

FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards.

3 GLAM METADATA SCHEMAS

How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

31 SEEING STANDARDS (2)

32 XML SCHEMAS

Do you deal with XML? I bet you do.

XML Schema (XSD): most widely used, but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

33 MUSEUM METADATA CDWA

Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.

332 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)

335 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more.
Display vs Indexing (structured) elements: inherited from CDWA.

34 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.

EAG: So-called controlled access points are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable, and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged, but often are not)
EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
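To show what little structure such a bioghist actually offers, the sketch below extracts it with Python's standard XML parser: the normalized date and any tagged person names are machine-readable, while the event itself stays free text. The function name and the output dictionary layout are choices made for this example.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""

def chron_events(xml_text):
    """Return one dict per chronitem: normalized date, flattened event text,
    and the normalized names of any tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            "text": " ".join("".join(event.itertext()).split()),
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events

events = chron_events(BIOGHIST)
```

The second event yields an empty person list: nothing in the markup says who reorganized what, which is the linking gap described above.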

35 LIBRARY METADATA MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.

351 MARC MUST DIE

A whole emotional subculture, based on a slogan by Roy Tennant (2002).

MARC is dead (is it really?): see marc-must-die.info (in-depth discussion), the FutureLib wiki, the Facebook group, and the Presentation by Sally Chambers (ELAG 2011).

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting", and conversion back to some other format "lowering"?

RDF is a simple, abstracted data model.
It doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). It has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience).
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info.

41 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that it's not expressive enough. E.g. it can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.

Europeana Data Quality Committee formed to push this strategic point (2015-2020).

Evolving specification (since 2009):

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.

42 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
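The relationship between a long path and its short-cut can be sketched over plain tuples: whenever a thing was measured by a measurement that observed a dimension, the short-cut statement follows. The `ex:` resources are hypothetical, and in practice such inference would be written as a rule in the triple store rather than in application code.

```python
LONG_PATH = ("crm:P39i_was_measured_by", "crm:P40_observed_dimension")
SHORTCUT = "crm:P43_has_dimension"

def infer_shortcuts(triples):
    """Derive thing -P43-> dimension from thing -P39i-> measurement -P40-> dimension."""
    # Index the second step of the path by its subject (the measurement)
    observed = {}
    for s, p, o in triples:
        if p == LONG_PATH[1]:
            observed.setdefault(s, []).append(o)
    inferred = set()
    for s, p, o in triples:
        if p == LONG_PATH[0]:
            for dimension in observed.get(o, []):
                inferred.add((s, SHORTCUT, dimension))
    return inferred

# Hypothetical resources, for illustration only
triples = [
    ("ex:painting1", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:width1"),
    ("ex:painting2", "crm:P2_has_type", "ex:Painting"),
]
shortcuts = infer_shortcuts(triples)
```

The long path keeps the measurement event itself (who measured, when, how), which the short-cut discards; that is the extra detail the long path allows.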

43 WEB ANNOTATION (OPEN ANNOTATION, OA)

W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of the Complete Example from the spec (using my rdfpuml).

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows you to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD.

45 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, include new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

453 A TASTE OF FRBROO

The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made (K. Coyle, SWIB 2015).

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on a Kiosk.

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig" ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 17000" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

46 ARCHIVAL ONTOLOGIES

There have been 3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015), Mlist for comments.
Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers are available as LOD, and some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals.

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

515 LETS FIX THE SECOND ONE

Find it on Getty's site, then add the info like this.

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
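Jittering can be as simple as adding a small random offset to each marker. This is a minimal sketch and not the EFD code: the centroid coordinates, the offset size and the function name are assumptions chosen for illustration.

```python
import random


def jitter(lat, lon, max_offset_deg=0.3, rng=None):
    """Return (lat, lon) moved by a small uniform random offset in degrees."""
    rng = rng or random
    return (lat + rng.uniform(-max_offset_deg, max_offset_deg),
            lon + rng.uniform(-max_offset_deg, max_offset_deg))


# Hypothetical centroid shared by every object tagged only "Bulgaria".
BULGARIA = (42.75, 25.5)

rng = random.Random(42)  # fixed seed, so the map looks the same on every run
markers = [jitter(*BULGARIA, rng=rng) for _ in range(9000)]
print(markers[:3])
```

A uniform offset in degrees spreads the flags over a box rather than a circle. For a country-level view that is usually good enough, and the fixed seed keeps markers from jumping around between page loads.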

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

Easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek, or just Święconka, in Polish). With proper categories, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of the LODLAM cloud (Diagram by J. Cobb, 2014).
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03.

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification.
Help implement R2RML scripts working off Getty's Oracle database; contribution to the Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn).
Work with a wide External Reviewers group (people from OCLC, Europeana, the ISO 25964 working group, etc.).
GraphDB semantic repo, clustered for high availability.
Semantic application development (customized Forest user interface) and tech consulting.
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD.
Help desk / support on Twitter and Google group (see home page).
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev, V., Lindenthal, J., and Isaac, A. International Journal on Digital Libraries, August 2015, Springer.

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation (V. Alexiev, CIDOC 2014).

External Ontologies

Prefix    Ontology                                Used for
bibo      Bibliography Ontology                   Sources
dc        Dublin Core Elements                    common
dct       Dublin Core Terms                       common
foaf      Friend of a Friend ontology             Contributors
iso       ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl       Web Ontology Language                   Basic RDF representation
prov      Provenance Ontology                     Revision history
rdf       Resource Description Framework          Basic RDF representation
rdfs      RDF Schema                              Basic RDF representation
schema    Schema.org                              common, geo (TGN), bio (ULAN)
skos      Simple Knowledge Organization System    Basic vocabulary representation
skosxl    SKOS Extension for Labels               Rich labels
wgs       W3C World Geodetic Survey geo           Geo (TGN)
xsd       XML Schema Datatypes                    Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: a custom sub-class, a custom (sub-)prop, or an ontology value (e.g. <term/kind/Abbreviation>).

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
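What the inverse and symmetric declarations buy you can be shown with a few lines of Python. This is an illustrative sketch only: in a real repository the reasoner does this work, and the ex: property names below are made up, not actual GVP relations.

```python
def materialize_inverses(triples, inverse_of, symmetric):
    """Add the reverse statement for every inverse or symmetric relation."""
    # Make the inverse table usable in both directions.
    both_ways = dict(inverse_of)
    both_ways.update({inv: prop for prop, inv in inverse_of.items()})

    result = set(triples)
    for s, p, o in triples:
        if p in symmetric:
            result.add((o, p, s))             # self-inverse: same property back
        elif p in both_ways:
            result.add((o, both_ways[p], s))  # paired property back
    return result


# Hypothetical declarations, as they might come out of the spreadsheet.
INVERSE_OF = {"ex:producedBy": "ex:produces"}
SYMMETRIC = {"ex:distinguishedFrom"}

asserted = {
    ("ex:vase", "ex:producedBy", "ex:potter"),
    ("ex:amphora", "ex:distinguishedFrom", "ex:hydria"),
}
closed = materialize_inverses(asserted, INVERSE_OF, SYMMETRIC)
for triple in sorted(closed):
    print(triple)
```

Each relation only has to be stated once in the source data. The second direction follows from the ontology, which is why generating these declarations from a spreadsheet pays off.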

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev, V., Cobb, J., Garcia, G., Harpring, P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.

68 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced the chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
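The first step, turning the names mentioned in one record into Person records of their own, might look like the sketch below. The field names follow the sample record. The identifier scheme, the relation names and the two "likely inferences" (the spouse shares the marriage date and carries the maiden name) are assumptions made for illustration, not the EHRI implementation.

```python
def derive_persons(rec):
    """Turn one flat record into Person records plus relations between them."""
    main_id = f"person/{rec['id']}"
    persons = {main_id: {"name": f"{rec['firstName']} {rec['lastName']}",
                         "gender": rec.get("gender")}}
    relations = []

    # Each "additional name" field becomes a new, minimal Person record.
    mentioned = (("nameSpouse", "spouseOf"),
                 ("nameChild", "parentOf"),
                 ("nameSibling", "siblingOf"))
    for field, relation in mentioned:
        name = rec.get(field)
        if not name:
            continue
        pid = f"{main_id}/{field}"
        persons[pid] = {"name": name}
        relations.append((main_id, relation, pid))

    # Likely inferences (assumptions): copy marriage date and maiden name to the spouse.
    spouse = persons.get(f"{main_id}/nameSpouse")
    if spouse is not None:
        if rec.get("dateMarriage"):
            spouse["dateMarriage"] = rec["dateMarriage"]
        if rec.get("nameSpouseMaiden"):
            spouse["maidenName"] = rec["nameSpouseMaiden"]
    return persons, relations


record = {"id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
          "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
          "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
          "nameSibling": "Jack Jones"}
persons, relations = derive_persons(record)
print(len(persons), "persons,", len(relations), "relations")
```

The derived records are deliberately thin. Their value comes in the next step, when they are matched against full Person records elsewhere in the database and the relations start to form a network.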

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline for free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition.
INRIA: word2vec experiments.

guard       Cos dist    punishment    Cos dist
guarding    0.593507    punishments   0.668144
sentry      0.512083    punish        0.601212
hlinka      0.496201    punishing     0.543213
gate        0.490032    beatings      0.527033
watching    0.484647    penalty       0.497262
rifle       0.484379    deserved      0.490157
lookout     0.482025    beaten        0.473870
patrol      0.477233    straf         0.473338
soldier     0.475982    offense       0.461230
guarded     0.474689    executing     0.459965
police      0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
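Both the neighbor scores and the "semantic differencing" trick rest on the same operation: comparing word vectors by the cosine of the angle between them. The sketch below uses tiny hand-made 3-dimensional vectors, which are invented so the arithmetic is easy to follow; real word2vec vectors have hundreds of dimensions and are learned from the interview text.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def analogy(vectors, a, b, c):
    """Word whose vector is closest to a - b + c, excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors (made up): dimensions read roughly as
# [organization, Soviet context, German context].
TOY = {
    "kgb":    [1.0, 1.0, 0.0],
    "stalin": [0.0, 1.0, 0.0],
    "hitler": [0.0, 0.0, 1.0],
    "ss":     [1.0, 0.0, 1.0],
    "guard":  [0.6, 0.2, 0.2],
}

print(analogy(TOY, "kgb", "stalin", "hitler"))
```

Subtracting "stalin" removes the Soviet component, adding "hitler" puts a German one in its place, and the nearest remaining word to the result is the answer to the analogy.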

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

Art History networks from Wikipedia, through VIAF id.
Time and nationality from ULAN.

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.


214 SPECTRUM

UK Museum Collections Management Standard

Defines procedures for museums to follow, and the attendant data.
Covers 21 procedures: Pre-entry; Object entry; Loans in; Acquisition; Inventory control; Location and movement control; Transport; Cataloguing; Object condition checking and technical assessment; Conservation and collections care; Risk management; Insurance and indemnity management; Valuation control; Audit; Rights management; Use of collections; Object exit; Loans out; Loss and damage; Deaccession and disposal; Retrospective documentation.
Addresses accreditation.

215 SPECTRUM EXAMPLE OBJECT ENTRY

22 ARCHIVAL CONTENT STANDARDS

ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions

Image by D. Pitti, 2015

23 LIBRARY CONTENT STANDARDS

AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But they sometimes pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

231 FRBR FRSAD FRAD

Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Žumer, 2011).

232 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):

Work: original or derived intellectual work (e.g. Don Quixote).
Expression: translation or edition (e.g. Don Quixote translation to English).
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here.
Item: physical copy. Libraries track loan/availability. Famous copies (e.g. Lincoln's Bible); manuscripts are singleton items.
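The four WEMI levels nest naturally, which a few data classes make visible. This is only a sketch of the idea: the class fields, the publisher and the library names are hypothetical, and FRBR itself defines many more attributes and relationships.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:                      # one physical copy
    holder: str
    note: str = ""


@dataclass
class Manifestation:             # one publisher's edition; ISBNs live here
    publisher: str
    isbn: Optional[str] = None
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:                # a translation or edition of the text
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:                      # the abstract intellectual work
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def all_items(self) -> List[Item]:
        """Every physical copy of this work, across all translations and editions."""
        return [item
                for expression in self.expressions
                for manifestation in expression.manifestations
                for item in manifestation.items]


quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("Example Press", items=[Item("City Library")])]),
    Expression("en", [Manifestation("Example Press", items=[Item("University Library", "signed copy")])]),
])
print([item.holder for item in quixote.all_items()])
```

A reader asking for "Don Quixote" can be answered at the Work level and then offered every available copy, regardless of language or publisher. That is the user task ("find, select, obtain") the model starts from.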

233 FRSAD

Anything can be a subject (thema), referred to by various names/titles (nomen).

234 FRBR-LRM

FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards.

3 GLAM METADATA SCHEMAS

How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (By Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library.)

31 SEEING STANDARDS (2)

32 XML SCHEMAS

Do you deal with XML? I bet you do.

XML Schema (XSD): most widely used, but most unwieldy.
RelaxNG (RNG): new-generation schema language.
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced.
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation).

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch of the jing RNG validator to emit errors like Schematron (SVRL with XPath error location).

https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck).

33 MUSEUM METADATA CDWA

Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements.
Display vs Indexing (structured) elements, e.g. for Dimension.

332 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop, 2010.)

335 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more.
Display vs Indexing (structured) elements: inherited from CDWA.

34 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units).
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families.
EAG: Encoded Archival Guide. Describes institutions.

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize).
Emphasis on documents, not historic agents and events.

EAG: So-called "controlled access points" are text, and typically not controlled at all.
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's <bioghist> element (example below from EADiva).
EAC: Related persons are names (strings), not links (things).
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged, but often are not).
EAC: Family tree is modeled as an Outline that's also used for other purposes (just presentation).

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
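When dates and person names are tagged as in this example, pulling structured events out of the narrative is straightforward. The sketch below embeds a copy of the sample (with the markup written out as well-formed XML, no namespace) and uses only Python's standard library; the output structure is an assumption chosen for illustration.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def chron_events(xml_text):
    """Extract date, tagged persons and plain text from each <chronitem>."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # Only names that were explicitly tagged can be recovered.
            "persons": [p.get("normal") for p in event.iter("persname")],
            # Collapse whitespace left over from the XML layout.
            "text": " ".join("".join(event.itertext()).split()),
        })
    return events


for ev in chron_events(BIOGHIST):
    print(ev)
```

The second event shows the limitation named above: without tagging, an event is just a string, and any persons or places in it have to be found by text analysis instead.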

35 LIBRARY METADATA MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles.
MARC-XML is not much better.

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002:

marc-must-die.info: in-depth discussion wiki
FutureLib: Facebook group
MARC is dead (is it really?): Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting" and back to some other format "lowering"?

RDF is a simple, abstracted data model.
It doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). It has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience).
The model is self-describing in a distributed way: if a class/property is looked up, it should return its description and info.
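The "no nesting bias" point can be made concrete with a toy lifting function that flattens a nested record into triples. This is a sketch under simplifying assumptions: property names are used as-is instead of being mapped to an ontology, and the ex: identifiers are invented.

```python
from itertools import count


def lift(record, subject, _ids=None):
    """Flatten a nested dict into (subject, property, value) triples."""
    ids = _ids if _ids is not None else count(1)
    triples = []
    for key, value in record.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            if isinstance(v, dict):
                # A nested element becomes a node of its own: named by its
                # "id" if it has one, otherwise by a fresh blank node.
                node = v.get("id") or f"_:b{next(ids)}"
                triples.append((subject, key, node))
                rest = {k: x for k, x in v.items() if k != "id"}
                triples.extend(lift(rest, node, ids))
            else:
                triples.append((subject, key, v))
    return triples


book = {
    "title": "Don Quixote",
    "creator": {"id": "ex:cervantes", "name": "Miguel de Cervantes"},
    "subject": ["chivalry", "satire"],
}
for triple in lift(book, "ex:book1"):
    print(triple)
```

Whether the creator was nested inside the book or described elsewhere and referenced by ID, the resulting triples are the same. That is why RDF from differently structured sources can be merged without caring who nested what.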

41 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources).
Dublin Core: descriptive metadata.
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.).
CIDOC CRM-inspired: events, some relations between objects.

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation.
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs.

Many providers use only the minimal features and make mistakes. Europeana didn't do a lot of validation.

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.

Europeana Data Quality Committee formed to push this strategic point (2015-2020).

Evolving specification (since 2009):

Currently considering actual implementation of Events.
Extensions for manuscripts, music, fashion, etc.

42 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
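The short-cut vs long-path construct can be illustrated by expanding one into the other. In this sketch the CRM names (P43, P39i, P40, E16) are real, but the ex: identifiers, the blank-node naming and the function itself are assumptions made for illustration.

```python
def expand_dimension_shortcut(triples):
    """Replace each P43 short-cut by the equivalent long path via a Measurement."""
    expanded = []
    n = 0
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            n += 1
            measurement = f"_:measurement{n}"
            expanded.append((s, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, "crm:P40_observed_dimension", o))
        else:
            expanded.append((s, p, o))
    return expanded


short = [("ex:painting", "crm:P43_has_dimension", "ex:height")]
for triple in expand_dimension_shortcut(short):
    print(triple)
```

The intermediate Measurement node is where the extra details go: who measured the object, when, and with what method. The short-cut has no place to hang such statements.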

43 WEB ANNOTATION (OPEN ANNOTATION OA)

W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers.
http://iiif.io

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows you to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD.

45 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic.
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough).
FRBRoo: based on CIDOC CRM, perhaps too complex.
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent).
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes.
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL.
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat.

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

453 A TASTE OF FRBROO

EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K. Coyle, SWIB 2015.

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4561 OSLO PUBLIC LIBRARY DATA

    d_res:tnr_749919
        rdf:type bibo:Document, fabio:Manifestation ;
        dc:title "About time" ;
        d:titleURLized "about_time" ;
        fabio:hasSubtitle "Einstein's unfinished revolution" ;
        ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
        foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
            <http://covers.openlibrary.org/b/id/96715-M.jpg>,
            <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
        owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
            <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
        dc:language lexvo:eng ;
        d:bibliofilID "931138" ;
        dc:format <http://data.deichman.no/format/Book> ;
        d:location_signature "Dav" ;
        dc:publisher d_org:penguin ;
        bibo:numPages "316" ;
        d:physicalDescription "fig." ;
        d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
        fabio:isManifestationOf d_work:x24918900_about_time ;
        d:signatureNote "07x0619gq" ;
        d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
        d:bsID "0181541" ;
        dc:description "Bibliografi: s. 293-294"@no ;
        d:priceInfo "Nkr 17000" ;
        foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
            <http://www.librarything.com/work/23493> ;
        dc:identifier "749919" ;
        d:dewey "115", "530.11" ;
        d:location_dewey "530.11" ;
        bibo:isbn "9780140174618", "0140174613" .

46 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015). Conceptual Model 1.0 (Sep 2016), Mlist for comments. Documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand.)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals.

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities, VIAF (not in M-n-M, but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. drawings by Rembrandt that are about Mammals.

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (eg <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, which is self-inverse)
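The pairing of inverse and symmetric relations can be illustrated with a minimal Python sketch. The relation names (`teacher_of`, `student_of`, `related_to`) and the entity names are hypothetical placeholders, not actual GVP relations, and a real triple store would do this through OWL inference rather than application code.

```python
# Hypothetical relation names for illustration; the real GVP list is much longer.
INVERSE_OF = {"teacher_of": "student_of"}   # declared owl:inverseOf pairs
SYMMETRIC = {"related_to"}                  # owl:SymmetricProperty (self-inverse)


def with_inverses(triples):
    """Return the triples plus every statement implied by inverse/symmetric props."""
    # Make the inverse table work in both directions.
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in SYMMETRIC:
            result.add((o, p, s))
        elif p in inverse:
            result.add((o, inverse[p], s))
    return result


facts = {("A", "teacher_of", "B"), ("X", "related_to", "Y")}
closed = with_inverses(facts)
```

Storing only one direction and deriving the other keeps the source data small, at the cost of needing inference at load or query time.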

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V, Cobb J, Garcia G, Harpring P. Getty Vocabularies Linked Open Data: Semantic Representation. Getty Research Institute, 3.2 edition, March 2015

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014)

Eg AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

68 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) "Cast after"

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale, and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
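The idea of deriving Person records from the names mentioned in one record can be sketched roughly as follows. The field names come from the sample record above; the `Person` class, the `related_persons` function and the single gender inference are illustrative assumptions, not EHRI's actual data model or matching logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    name: str
    gender: Optional[str] = None        # inferred where plausible, else unknown
    maiden_name: Optional[str] = None


def related_persons(rec):
    """Create (relation, Person) pairs for the people mentioned in one record."""
    people = []
    if "nameSpouse" in rec:
        # Likely inference (an assumption): a spouse has the opposite gender.
        spouse_gender = {"Male": "Female", "Female": "Male"}.get(rec.get("gender"))
        people.append(
            ("spouse", Person(rec["nameSpouse"], spouse_gender, rec.get("nameSpouseMaiden")))
        )
    for key, relation in (("nameChild", "child"), ("nameSibling", "sibling")):
        if key in rec:
            people.append((relation, Person(rec[key])))
    return people


record = {
    "id": 123456, "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
mentioned = related_persons(record)
```

Each derived Person could then be compared against existing Person records, for example by name and approximate dates, to link networks across sources.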

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
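How such neighbor lists and "semantic differencing" results are computed can be shown with a small pure-Python sketch. The vectors below are invented toy values in three dimensions (real word2vec vectors have hundreds of dimensions and are learned from the corpus), and the words `orgA`, `leaderA` and so on are placeholders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest(word, vectors):
    """Other words sorted by decreasing cosine similarity to `word`."""
    return sorted(
        ((cosine(vectors[word], vec), w) for w, vec in vectors.items() if w != word),
        reverse=True,
    )


def analogy(a, b, c, vectors):
    """Word closest to vec(a) - vec(b) + vec(c), excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Toy vectors, made up for illustration only.
toy = {"guard": [1.0, 0.0, 0.2], "sentry": [0.9, 0.1, 0.3], "penalty": [0.0, 1.0, 0.1]}
toy_orgs = {
    "orgA": [1.0, 1.0, 0.0], "leaderA": [0.0, 1.0, 0.0],
    "leaderB": [0.0, 0.0, 1.0], "orgB": [1.0, 0.0, 1.0], "other": [0.0, 0.5, 0.5],
}
closest_to_guard = nearest("guard", toy)[0][1]
differenced = analogy("orgA", "leaderA", "leaderB", toy_orgs)
```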

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War

6134 NOMISMA

Shared authorities for numismatics. Eg a mint

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki


215 SPECTRUM EXAMPLE OBJECT ENTRY

22 ARCHIVAL CONTENT STANDARDS

ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (eg Secretary of some society)
ISDIAH: archival holding institutions

Image by DPitti 2015

23 LIBRARY CONTENT STANDARDS

AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But sometimes they pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

231 FRBR FRSAD FRAD

Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Žumer, 2011)

232 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles)

Work: original or derived intellectual work (eg Don Quixote)
Expression: translation or edition (eg Don Quixote translation to English)
Manifestation: publisher's work (eg with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability, famous copies (eg Lincoln's Bible). Manuscripts are singleton items
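The four WEMI levels nest naturally, which a small Python sketch makes concrete. The class and field names below are illustrative choices rather than an official FRBR serialization, and the publisher, ISBN and library names are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:                      # one physical copy
    holder: str
    note: str = ""


@dataclass
class Manifestation:             # a publisher's product; ISBNs live here
    publisher: str
    isbn: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:                # a translation or edition
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:                      # the abstract intellectual work
    title: str
    expressions: List[Expression] = field(default_factory=list)


def count_items(work):
    """Total number of physical copies across all expressions and manifestations."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


quixote = Work("Don Quixote", [
    Expression("en", [
        Manifestation("Example Press", "placeholder-isbn-1",
                      [Item("Library A"), Item("Library B", "signed copy")]),
    ]),
    Expression("es", [Manifestation("Editorial Ejemplo", "placeholder-isbn-2", [Item("Library A")])]),
])
total_copies = count_items(quixote)
```

A user looking for "Don Quixote in English" selects at the Expression level, while loan availability is answered at the Item level.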

233 FRSAD

Anything can be a subject (thema), referred to by various names/titles (nomen)

234 FRBR-LRM

FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards

3 GLAM METADATA SCHEMAS

How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

31 SEEING STANDARDS (2)

32 XML SCHEMAS

Do you deal with XML? I bet you do

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. Eg EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (eg cross-field validation)

Tools

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

33 MUSEUM METADATA: CDWA

Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, eg for Dimension

332 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)

335 LIDO SCHEMA

Complex schema: eg when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA

34 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events

EAG: So-called "controlled access points" are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's <bioghist> element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
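To show what "semanticizing" such a chronology involves, here is a minimal Python sketch that pulls dates, event text and tagged person names out of a bioghist fragment using only the standard library. The function name `chron_events` and the output dictionary layout are assumptions for illustration; the embedded XML repeats the EADiva-style example with its punctuation restored.

```python
import xml.etree.ElementTree as ET

EAD_SNIPPET = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def chron_events(xml_text):
    """Extract one dict per chronitem: normalized date, plain text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        # Only names that were explicitly tagged can be recovered as entities.
        persons = [p.get("normal") for p in event.iter("persname")]
        text = " ".join("".join(event.itertext()).split())
        events.append({"date": date.get("normal"), "text": text, "persons": persons})
    return events


events = chron_events(EAD_SNIPPET)
```

The second event illustrates the problem named above: without tagged names, nothing beyond the date and a free-text string can be extracted.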

35 LIBRARY METADATA: MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002

marc-must-die.info wiki
FutureLib Facebook group
Presentation by Sally Chambers, ELAG 2011: MARC is dead (is it really?), in-depth discussion

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting", and back to some other format "lowering"?

RDF is a simple abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID)
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info

41 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM: inspired events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that it's not expressive enough. Eg can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
  EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects (retro-converted from ESE) are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
  Currently considering actual implementation of Events
  Extensions for manuscripts, music, fashion, etc

42 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations

Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
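The short-cut versus long-path construct can be sketched in Python by expanding each short-cut statement into the longer path through an intermediate measurement node. Triples are plain tuples here, the function name and the `ex:` identifiers are hypothetical, and the blank-node naming scheme is an assumption; a real pipeline would use an RDF library or inference rules.

```python
def expand_dimension_shortcut(triples):
    """Rewrite crm:P43_has_dimension into the path through a Measurement node."""
    expanded = []
    counter = 0
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            counter += 1
            measurement = f"_:measurement{counter}"   # new intermediate node
            expanded.append((s, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, "crm:P40_observed_dimension", o))
        else:
            expanded.append((s, p, o))
    return expanded


short = [
    ("ex:painting1", "crm:P2_has_type", "ex:Painting"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
]
long_path = expand_dimension_shortcut(short)
```

The intermediate node is what "allows more details": who measured, when, and with what method can all be attached to it.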

43 WEB ANNOTATION (OPEN ANNOTATION OA)

W3C TR: mark, annotate, relate any web resources, eg Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml)

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See eg Rob Sanderson's presentations: IIIF and JSON-LD

45 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (eg citation intent)
BibFrame: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats

453 A TASTE OF FRBROO

The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, eg on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on Kiosk

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction
    <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs
    <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf
    <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

46 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015). Mlist for comments
Conceptual Model 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD, some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). Eg Frans Hals

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2)

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info like this

516 HISTROPEDIA

Timelines of everything. Eg paintings by Leonardo

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the "library-tradition" datasets (dominated by VIAF) and the "LOD-tradition" datasets (dominated by Wikidata)
VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN, RKD artists, works, LoC Authorities, VIAF (not in M-n-M but on WD), BM persons, BBC YourPaintings, Artsy, etc etc

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, eg:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions are sometimes integrated into Virtual Research Environments

61 MELLON "SPACE" PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman)

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

64 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A, Tolosi L, and Alexiev V. LNCS 9398, p182-196, January 2016 (preprint)

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
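Jittering can be sketched in a few lines of Python: each marker gets a small random offset so that thousands of objects sharing one country-level coordinate do not stack on a single point. The center coordinates for Bulgaria and the 0.5 degree spread below are rough illustrative assumptions, not the values used in the EFD application.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the coordinate shifted by a uniform random offset in each direction."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


# Approximate center of Bulgaria (assumed values for illustration).
CENTER_LAT, CENTER_LON = 42.7, 25.5

rng = random.Random(42)   # seeded so the layout is reproducible between runs
markers = [jitter(CENTER_LAT, CENTER_LON, rng=rng) for _ in range(5)]
```

A seeded generator keeps marker positions stable across page loads; combined with marker clustering, the jittered points only fan out when the user zooms in.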

6610 GLAMS WORKING WITH WIKIDATA

GLAMs Working with Wikidata: Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

Eg easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust. AAT 2014-02, TGN 2014-08, ULAN 2015-03


2.2 ARCHIVAL CONTENT STANDARDS

ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (eg Secretary of some society)
ISDIAH: archival holding institutions

Image by D.Pitti, 2015

2.3 LIBRARY CONTENT STANDARDS

AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But sometimes they pay more attention to where to put the commas than to:

Data sharing
Global availability of resources
Sharing the cataloging burden

2.3.1 FRBR, FRSAD, FRAD

Functional Requirements for: Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J.Mitchell, M.Zeng, M.Žumer, 2011)

2.3.2 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):

Work: original or derived intellectual work (eg Don Quixote)
Expression: translation or edition (eg Don Quixote translation to English)
Manifestation: publisher's work (eg with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (eg Lincoln's Bible); manuscripts are singleton items
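The four WEMI levels can be sketched as a chain of records, each pointing one level up. This is only an illustration of the model described above, not any library system's schema: the class and field names, the publisher "Example Press" and the two holding libraries are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Work:
    title: str                       # the abstract intellectual work


@dataclass(frozen=True)
class Expression:
    work: Work
    language: str                    # a translation or edition of the Work


@dataclass(frozen=True)
class Manifestation:
    expression: Expression
    publisher: str
    isbn: Optional[str] = None       # ISBNs live at this level


@dataclass(frozen=True)
class Item:
    manifestation: Manifestation
    holder: str                      # the library holding this physical copy
    shelfmark: str


def work_of(item: Item) -> Work:
    """Walk up Item -> Manifestation -> Expression -> Work."""
    return item.manifestation.expression.work


# Hypothetical sample data (publisher and libraries are invented)
quixote = Work("Don Quixote")
english = Expression(quixote, "en")
edition = Manifestation(english, "Example Press")
copy_a = Item(edition, "Library A", "shelf 12")
copy_b = Item(edition, "Library B", "shelf 7")
```

Two physical copies in different libraries stay distinct Items, yet `work_of` resolves both to the same Work, which is the kind of grouping WEMI is meant to support.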

2.3.3 FRSAD

Anything can be a subject (thema), referred to by various names/titles (nomen)

2.3.4 FRBR-LRM

FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards

3 GLAM METADATA SCHEMAS

How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

3.1 SEEING STANDARDS (2)

3.2 XML SCHEMAS

Do you deal with XML? I bet you do

XML Schema (XSD): most widely used, but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. Eg EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (eg cross-field validation)

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

3.3 MUSEUM METADATA: CDWA

Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)

3.3.1 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, eg for Dimension

3.3.2 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements

3.3.3 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this

3.3.4 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)

3.3.5 LIDO SCHEMA

Complex schema: eg when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements inherited from CDWA

3.4 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context, Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events

EAG: So-called "controlled access points" are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings) not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged, but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
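To show why tagging matters for semantic integration, here is a minimal sketch that pulls dates and tagged person names out of a chronlist like the one above, using only the Python standard library. The function name and the output shape are assumptions for illustration; untagged names in the event text would simply be missed, which is exactly the problem noted above.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""


def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, event text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # Flatten mixed content and collapse whitespace
            "text": " ".join("".join(event.itertext()).split()),
            # Only names wrapped in <persname> can be found reliably
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


EVENTS = extract_events(BIOGHIST)
```

The second event yields an empty person list: anyone involved in the reorganization is invisible to software unless the cataloger tagged the names.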

3.5 LIBRARY METADATA: MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002

marc-must-die.info wiki
FutureLib Facebook group
MARC is dead (is it really?): in-depth discussion
Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting", and back to some other format "lowering"?

RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info

4.1 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM inspired: events, some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

Criticized that it's not expressive enough. Eg can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana

Europeana Data Quality Committee formed to push this strategic point (2015-2020)

Evolving specification (since 2009):

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc

4.2 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
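The short-cut vs long-path construct can be illustrated with plain triples: whenever a thing was measured by a Measurement that observed a Dimension, the short-cut from thing to Dimension follows. The sketch below is an assumption-laden toy (triples as Python tuples, `ex:` identifiers invented), not how any triple store or CRM reasoner implements it.

```python
P43 = "crm:P43_has_dimension"          # short-cut: thing -> dimension
P39I = "crm:P39i_was_measured_by"      # long path, step 1: thing -> measurement
P40 = "crm:P40_observed_dimension"     # long path, step 2: measurement -> dimension


def infer_shortcuts(triples):
    """Derive P43 short-cut triples from P39i / P40 long paths."""
    observed = {}
    for s, p, o in triples:
        if p == P40:
            observed.setdefault(s, []).append(o)
    return {
        (thing, P43, dim)
        for thing, p, measurement in triples
        if p == P39I
        for dim in observed.get(measurement, [])
    }


# Hypothetical instance data: the long path can carry extra detail
# (who measured) that the short-cut has no place for
LONG_PATH = [
    ("ex:painting1", P39I, "ex:measurement1"),
    ("ex:measurement1", "rdf:type", "crm:E16_Measurement"),
    ("ex:measurement1", P40, "ex:height1"),
    ("ex:measurement1", "crm:P14_carried_out_by", "ex:curator1"),
]

SHORTCUTS = infer_shortcuts(LONG_PATH)
```

Going the other way (from short-cut to long path) would require inventing a Measurement node, which is why the long path is the richer form to store.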

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)

W3C TR: mark, annotate, relate any web resources, eg Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary

Diagram of Complete Example from spec (using my rdfpuml)
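As a rough illustration of the "image and region over it" case, the sketch below builds a JSON-LD annotation in the shape defined by the W3C Web Annotation Data Model (a textual body, an image target and a media-fragment selector). The helper name, the `example.org` URLs and the comment text are hypothetical; this is not the Complete Example from the spec.

```python
import json


def make_annotation(anno_id, image_url, region_xywh, comment):
    """Build a Web Annotation (as a dict) commenting on a rectangular image region."""
    x, y, w, h = region_xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Hypothetical identifiers, for illustration only
EXAMPLE = make_annotation(
    "http://example.org/anno1",
    "http://example.org/image1.jpg",
    (10, 20, 300, 150),
    "Detail of the signature",
)
EXAMPLE_JSON = json.dumps(EXAMPLE, indent=2)
```

Because the annotation is its own resource, it can live in a different system than the image it talks about, which is what makes the model useful for linking across collections.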

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc etc. See eg Rob Sanderson's presentations IIIF and JSON-LD

4.5 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (eg citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats

4.5.3 A TASTE OF FRBROO

The EDM-FRBRoo Application Profile Task Force asked: what to add to EDM to better fit FRBRoo?

TF members developed a number of examples, eg on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on Kiosk

4.5.6.1 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig" ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

4.6 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Contexts (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Progress report (2015), Mlist for comments
Conceptual Model 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)

5.1 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). Eg Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2)

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this

5.1.6 HISTROPEDIA

Timelines of everything. Eg paintings by Leonardo

5.2 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc etc

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, eg:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments

6.1 MELLON "SPACE" PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D.Oldman)

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions
Includes the (in)famous diagram

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services

6.5.1 EUROPEANA STATISTICS

Eg chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector

6.6.1 TASTY BULGARIAN RECIPES

Eg 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
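Jittering can be sketched as adding a small random offset to each marker's coordinates. The version below is a minimal guess at the idea, not the EFD implementation: the uniform offset, the 0.5 degree limit, the seed and the approximate centre coordinates for Bulgaria are all assumptions.

```python
import random

# Approximate centre of Bulgaria (lat, lon); an assumption for the example
BULGARIA_CENTER = (42.7, 25.5)


def jitter(lat, lon, max_offset_deg=0.5, rng=None):
    """Return the point moved by a random offset of at most max_offset_deg per axis."""
    rng = rng or random
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


def jitter_all(n, center, max_offset_deg=0.5, seed=0):
    """Jitter n markers that share one coordinate; seeded so the map is stable."""
    rng = random.Random(seed)
    return [jitter(center[0], center[1], max_offset_deg, rng) for _ in range(n)]


MARKERS = jitter_all(100, BULGARIA_CENTER)
```

A fixed seed keeps each flag in the same place between page loads; a production version might instead derive the offset from the object id, and clip it to the country outline.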

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

External Ontologies. See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014

Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (eg <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
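What inverse and symmetric declarations buy you can be shown with a few triples: each asserted relation implies its mirror image, so data entered in one direction can be queried in both. The sketch below is a toy closure over Python tuples; the `ex:` property and entity names are hypothetical stand-ins, not actual GVP relation codes.

```python
def close_inverses(triples, inverse_pairs, symmetric=()):
    """Add the mirrored triple for every inverse or symmetric property used."""
    inverse_of = {}
    for p, q in inverse_pairs:          # owl:inverseOf pairs
        inverse_of[p] = q
        inverse_of[q] = p
    for p in symmetric:                 # owl:SymmetricProperty is its own inverse
        inverse_of[p] = p
    closed = set(triples)
    for s, p, o in triples:
        if p in inverse_of:
            closed.add((o, inverse_of[p], s))
    return closed


# Hypothetical relations and entities, for illustration only
ASSERTED = [
    ("ex:A", "ex:teacher_of", "ex:B"),
    ("ex:X", "ex:distinguished_from", "ex:Y"),
]
CLOSED = close_inverses(
    ASSERTED,
    inverse_pairs=[("ex:teacher_of", "ex:student_of")],
    symmetric=["ex:distinguished_from"],
)
```

In a triple store the same effect comes from declaring the properties once in the ontology and letting the reasoner materialize the mirrored statements.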

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). Eg:

AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

6.8 J.P.GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples

6.8.1 J.P.GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata: WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) "Cast after"

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015), it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
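The idea can be sketched as follows: expand one source record into a Person per name mentioned, link them with relations, and add a few plausible inferences. Everything beyond the field names in the sample record is an assumption: the `Person` class, the relation labels and the two inference rules (spouse gender, birth name from maiden name) are illustrative, and the matching step against the database is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    alt_names: List[str] = field(default_factory=list)
    relations: List[Tuple[str, str]] = field(default_factory=list)


# source field -> (relation of main person to other, relation of other to main)
RELATED_FIELDS = {
    "nameSpouse": ("spouse of", "spouse of"),
    "nameChild": ("parent of", "child of"),
    "nameSibling": ("sibling of", "sibling of"),
}


def expand_record(rec):
    """Create Person records for the main person and everyone mentioned."""
    main = Person(f"{rec['firstName']} {rec['lastName']}", rec.get("gender"))
    people = [main]
    for key, (rel, inverse) in RELATED_FIELDS.items():
        if key not in rec:
            continue
        other = Person(rec[key])
        main.relations.append((rel, other.name))
        other.relations.append((inverse, main.name))
        if key == "nameSpouse":
            # Likely inferences (hypothetical rules, plausible for 1920s records)
            if main.gender == "Male":
                other.gender = "Female"
            elif main.gender == "Female":
                other.gender = "Male"
            maiden = rec.get("nameSpouseMaiden")
            if maiden:
                other.alt_names.append(f"{other.name.split()[0]} {maiden}")
        people.append(other)
    return people


SAMPLE = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
PEOPLE = expand_record(SAMPLE)
```

The inferred birth name "Maria Matienzo" gives the matcher a second name to look for, which is the kind of extra evidence that helps connect records across archives.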

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
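Both the neighbour lists and the "semantic differencing" rest on the same two operations: cosine similarity between word vectors, and vector arithmetic followed by a nearest-neighbour lookup. The sketch below uses tiny hand-made 3-dimensional vectors purely to show the mechanics; real word2vec vectors have hundreds of dimensions learned from the interview text, and none of these toy numbers come from the EHRI experiments.

```python
import math


def cosine(u, v):
    """Cosine similarity: 1.0 for same direction, 0.0 for orthogonal vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors (invented): axes loosely mean organization, Soviet, Nazi
TOY = {
    "KGB": [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS": [1.0, 0.0, 1.0],
    "guard": [1.0, 0.1, 0.1],
}

RESULT = analogy(TOY, "KGB", "Stalin", "Hitler")
```

With these toy vectors, subtracting "Stalin" removes the Soviet component from "KGB" and adding "Hitler" supplies the Nazi one, so the nearest remaining word is "SS". The scores in the table above behave like similarities (higher means closer), even though the column is labelled as a distance.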

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing them to Geonames, so we can get coordinates

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War

6.13.4 NOMISMA

Shared authorities for numismatics. Eg a mint

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki

Page 18: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

23 LIBRARY CONTENT STANDARDSAACR2 (Anglo-American Cataloging Rules 2)International Standard Bibliographic Description (ISBD)Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later) But sometimes pay moreattention where to put the commas than to

Data sharingGlobal availability of resourcesSharing the cataloging burden

231 FRBR FRSAD FRAD

Functional Requirements for Bibliographic Records (FRBR) Subject Authority Data(FRSAD) Authority Data (FRAD) (JMitchell MZeng MZumer 2011)

232 FRBR

Starts from user tasks ( nd identify select obtain explore) Introduces the important 4-level WEMI model (relates to Uniform Titles)

Work original or derived intellectual work (eg Don Quixote)Expression translation or edition (eg Don Quixote translation to English)Manifestation publishers work (eg with illustrations foreword by compilationhellip)ISBNs are hereItem physical copy libraries track loanavailability famous copies (eg Lincolns Bible)manuscripts are singleton items

233 FRSAD

Anything can be subject (thema) referred to by various namestitles (nomen)

234 FRBR-LRM

FRBR-Library Reference Model (PRiva PLe Bœuf MŽumer Draft for World-WideReview 2016-02) Merges the previous standards

3 GLAM METADATA SCHEMASHow many of the standards listed in

apply to your work (by Jenn Riley Associate Dean for Digital Initiatives atMcGill University Library)

Seeing Standards A Visualization of the MetadataUniverse

31 SEEING STANDARDS (2)

32 XML SCHEMASDo you deal with XML I bet you do

XML Schema (XSD) most widely used but most unwieldyRelaxNG (RNG) new generation schema languageRNG Compact (RNC) non-XML notation most readable Eg EAD3 is mastered inRNC then RNG and XSD producedSchematron express rules in XPath that cant be captured in XSDRNGRNC (egcross- eld validation)

Tools

patch the jing RNG validator toemit errors like Schematron (SVRL with XPath error location)

RNC tools and CH schemas in RNC Emacswith code highlighting and syntax checking ( ycheck)

httpsgithubcomEHRIjing-trangtreeEHRI-176

httpsgithubcomVladimirAlexievrnc

33 MUSEUM METADATA CDWACategories for the Description of Works of Art (CDWA) realization of CCO 532categories (data elements)

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension

332 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop, 2010)

335 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA

34 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events

EAG: So-called controlled access points are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
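A short sketch of what can and cannot be pulled out of such a bioghist fragment. It assumes the example is well-formed XML with the angle brackets, quotes and closing tags restored; the function name and the output dictionary layout are illustrative choices.

```python
import xml.etree.ElementTree as ET

# The EADiva-style example, assumed here to be well-formed XML.
BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, flat text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date").get("normal")
        event = item.find("event")
        # Flatten mixed content and collapse whitespace.
        text = " ".join("".join(event.itertext()).split())
        # Only names that were explicitly tagged can be recovered.
        persons = [p.get("normal") for p in event.findall("persname")]
        events.append({"date": date, "text": text, "persons": persons})
    return events


for ev in extract_events(BIOGHIST):
    print(ev)
```

Even in this well-tagged example the persons come back as name strings with no identifiers, and the relation "succeeds" survives only as free text, which is the linking problem described above.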

35 LIBRARY METADATA MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002

marc-must-die.info: in-depth discussion wiki
FutureLib: Facebook group
Presentation: MARC is dead (is it really?) by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting" and back to some other format "lowering"?

RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info

41 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC CRM: inspired events and some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUES/CONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana

Europeana Data Quality Committee formed to push this strategic point (2015-2020)

Evolving specification (since 2009)

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc

42 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations

Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
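The short-cut versus long-path construct can be sketched as a rewrite over triples. The code below is only an illustration: triples are plain Python tuples, the ex: identifiers and blank-node names are made up, and crm:E16_Measurement is used as the class of the intermediate node.

```python
from itertools import count


def expand_dimension_shortcut(triples):
    """Rewrite each P43_has_dimension short-cut as the long path through a
    measurement node, which can then carry extra detail (who measured, when)."""
    ids = count(1)
    out = []
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            event = f"_:measurement{next(ids)}"  # made-up blank node label
            out.append((s, "crm:P39i_was_measured_by", event))
            out.append((event, "rdf:type", "crm:E16_Measurement"))
            out.append((event, "crm:P40_observed_dimension", o))
        else:
            out.append((s, p, o))
    return out


short = [
    ("ex:painting1", "crm:P2_has_type", "ex:Painting"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
]
for triple in expand_dimension_shortcut(short):
    print(triple)
```

The short-cut is one triple; the long path costs three, but the measurement node is a place to attach the date or the person who took the measurement.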

43 WEB ANNOTATION (OPEN ANNOTATION OA)

W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml)

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations on IIIF and JSON-LD

45 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA). The registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats

453 A TASTE OF FRBROO

EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig";
  dc:subject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi s 293-294"@no;
  d:priceInfo "Nkr 17000";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

46 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Progress report (2015), Mlist for comments
Conceptual Model 10 (Sep 2016): Document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info like this

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

Watch the video (D. Oldman). (Not Ontotext work)

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

64 CONSERVATIONSPACE

Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V., LNCS 9398, p. 182-196, January 2016 (preprint)

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
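Jittering can be sketched in a few lines: each marker that shares the same coarse location gets a small random offset. The center coordinates and the 0.5 degree radius below are rough assumptions chosen for illustration, not the values used in the EFD application.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset of at most
    radius_deg along each axis."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))


BULGARIA_CENTER = (42.75, 25.5)  # approximate, assumed for the example

rng = random.Random(42)  # seeded so the layout is reproducible
markers = [jitter(*BULGARIA_CENTER, rng=rng) for _ in range(5)]
for lat, lon in markers:
    print(f"{lat:.3f}, {lon:.3f}")
```

Seeding the generator keeps marker positions stable between page loads; an unseeded generator would make the flags jump around on every render.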

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (święcenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014

External Ontologies

Prefix  Ontology                              Used for
bibo    Bibliography Ontology                 Sources
dc      Dublin Core Elements                  common
dct     Dublin Core Terms                     common
foaf    Friend of a Friend ontology           Contributors
iso     ISO 25964 (latest on thesauri)        iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                 Basic RDF representation
prov    Provenance Ontology                   Revision history
rdf     Resource Description Framework        Basic RDF representation
rdfs    RDF Schema                            Basic RDF representation
schema  Schema.org                            common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System  Basic vocabulary representation
skosxl  SKOS Extension for Labels             Rich labels
wgs     W3C World Geodetic Survey geo         Geo (TGN)
xsd     XML Schema Datatypes                  Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty: self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

68 JPGETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
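The idea can be sketched as follows. This is only an illustration on the made-up record above: the Person class, the role labels and the two "likely inferences" (last name taken from the final name token, spouse gender guessed from the record holder's gender) are assumptions, not the EHRI implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    name: str
    role: str                        # relation to the record holder
    last_name: Optional[str] = None  # inferred, may be wrong
    gender: Optional[str] = None     # inferred, may be wrong


def derive_persons(rec):
    """Create a Person for the record holder and for each relative mentioned."""
    persons = [Person(f"{rec['firstName']} {rec['lastName']}", "self",
                      rec["lastName"], rec.get("gender"))]
    # Likely inference for records of this period; to be treated as a guess.
    spouse_gender = {"Male": "Female", "Female": "Male"}.get(rec.get("gender"))
    for key, role in (("nameSpouse", "spouse"),
                      ("nameChild", "child"),
                      ("nameSibling", "sibling")):
        name = rec.get(key)
        if name:
            persons.append(Person(
                name, role,
                last_name=name.split()[-1],
                gender=spouse_gender if role == "spouse" else None))
    return persons


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
for p in derive_persons(record):
    print(p)
```

Each derived Person then becomes a candidate for matching against existing Person records, where the inferred fields should carry lower weight than the stated ones.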

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard     Cos dist   punishment   Cos dist
guarding  0.593507   punishments  0.668144
sentry    0.512083   punish       0.601212
hlinka    0.496201   punishing    0.543213
gate      0.490032   beatings     0.527033
watching  0.484647   penalty      0.497262
rifle     0.484379   deserved     0.490157
lookout   0.482025   beaten       0.473870
patrol    0.477233   straf        0.473338
soldier   0.475982   offense      0.461230
guarded   0.474689   executing    0.459965
police    0.474291   merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
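The mechanics behind both the neighbor table and the semantic differencing can be sketched with cosine similarity over word vectors. The three-dimensional vectors below are hand-made toys chosen so the arithmetic works out; real word2vec vectors are learned from the interview text and have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Toy vectors, invented for this example only.
TOY = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS":     [1.0, 0.0, 1.0],
    "guard":  [0.5, 0.2, 0.2],
}
print(analogy(TOY, "KGB", "Stalin", "Hitler"))
```

Excluding the three input words from the candidates is the usual convention, since otherwise one of the inputs often comes out as its own nearest neighbor.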

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6134 NOMISMA

Shared authorities for numismatics Eg a mint

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius... rendered in a chart using d3.js

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki

Page 19: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

231 FRBR FRSAD FRAD

Functional Requirements for Bibliographic Records (FRBR) Subject Authority Data(FRSAD) Authority Data (FRAD) (JMitchell MZeng MZumer 2011)

232 FRBR

Starts from user tasks ( nd identify select obtain explore) Introduces the important 4-level WEMI model (relates to Uniform Titles)

Work original or derived intellectual work (eg Don Quixote)Expression translation or edition (eg Don Quixote translation to English)Manifestation publishers work (eg with illustrations foreword by compilationhellip)ISBNs are hereItem physical copy libraries track loanavailability famous copies (eg Lincolns Bible)manuscripts are singleton items

233 FRSAD

Anything can be subject (thema) referred to by various namestitles (nomen)

234 FRBR-LRM

FRBR-Library Reference Model (PRiva PLe Bœuf MŽumer Draft for World-WideReview 2016-02) Merges the previous standards

3 GLAM METADATA SCHEMASHow many of the standards listed in

apply to your work (by Jenn Riley Associate Dean for Digital Initiatives atMcGill University Library)

Seeing Standards A Visualization of the MetadataUniverse

31 SEEING STANDARDS (2)

32 XML SCHEMASDo you deal with XML I bet you do

XML Schema (XSD) most widely used but most unwieldyRelaxNG (RNG) new generation schema languageRNG Compact (RNC) non-XML notation most readable Eg EAD3 is mastered inRNC then RNG and XSD producedSchematron express rules in XPath that cant be captured in XSDRNGRNC (egcross- eld validation)

Tools

patch the jing RNG validator toemit errors like Schematron (SVRL with XPath error location)

RNC tools and CH schemas in RNC Emacswith code highlighting and syntax checking ( ycheck)

httpsgithubcomEHRIjing-trangtreeEHRI-176

httpsgithubcomVladimirAlexievrnc

33 MUSEUM METADATA CDWACategories for the Description of Works of Art (CDWA) realization of CCO 532categories (data elements)

331 CDWA LITE

XML schema implementing part of CDWA Moderate complexity about 300 elementsDisplay vs Indexing (structured) elements eg for Dimension

332 CONA SCHEMA

Cultural Objects Name Authority (CONA) Getty museum data aggregation Moderatecomplexity about 280 elements

333 SPECTRUM XML

has 10 entities and 592 elds of which 490 are Object(artwork) elds I am not aware of any systems producing thisSPECTRUM Schema 40b

334 LIDO

Lightweight Information Describing Objects (LIDO) Evolved from CDWA museumdatwith inspiration from CIDOC CRM (Images by RStein and AVitzthum ATHENAworkshop 2010)

335 LIDO SCHEMA

Complex schema eg when referring to a related object you can provide almost asmuch detail as for the main object Could leverage opportunities for linking moreDisplay vs Indexing (structured) elements inherited from CDWA

34 ARCHIVE METADATAEAD Encoded Archival Description Describes archival materials (documentary units)EACCPF Encoded Archival Context Corporations Persons FamiliesEAG Encoded Archival Guide Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation not enough to linking (dif cult to semanticize)Emphasis on documents not historic agents and events

EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)

bioghist

ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt

35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Fielding 2002

MARC is dead (is it really) in-depth discussion wiki

marc-must-dieinfoFutureLibFacebook group

by Sally Chambers ELAG 2011Presentation

4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering

RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info

41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on

OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object

EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs

Many providers use the minimal features and make mistakes Europeana didnt do alot of validation

Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana

formed to push this strategic point (2015-2020)Europeana Data Quality Committee

Evolving speci cation (since 2009)

Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc

42 CIDOC CRM comprehensive reference model used for history historic events

archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual

CIDOC CRM

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
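The short-cut vs long-path construct can be sketched in a few lines of Python. This is a toy over tuples, not a reasoner: the function name and all `ex:` identifiers are hypothetical, and only the CRM property names come from the model. Given long-path triples (thing, measurement, dimension), it derives the implied short-cut triples.

```python
def infer_dimension_shortcuts(triples):
    """Derive P43_has_dimension from P39i_was_measured_by followed by P40_observed_dimension."""
    observed = {}
    for s, p, o in triples:
        if p == "crm:P40_observed_dimension":
            observed.setdefault(s, []).append(o)
    return {(thing, "crm:P43_has_dimension", dim)
            for thing, p, measurement in triples
            if p == "crm:P39i_was_measured_by"
            for dim in observed.get(measurement, [])}


# Hypothetical long-path data: the measurement event can carry extra detail
# (here a time-span) that the short-cut cannot express.
graph = {
    ("ex:painting1", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height1"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:year2015"),
    ("ex:height1", "crm:P90_has_value", 82.5),
}

shortcuts = infer_dimension_shortcuts(graph)
```

The long path is where "more details" live (who measured, when, by which method); the short-cut is convenient for search and display.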

43 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
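For the "Image and region over it" case, a minimal sketch of a Web Annotation in JSON-LD might look as follows. The helper name, the image URL and the comment text are assumptions for illustration; the keys follow the W3C Web Annotation data model, with the region given as a media-fragment selector.

```python
import json


def image_region_annotation(image_url, x, y, w, h, comment):
    """Build a Web Annotation (as a Python dict) that comments on a rectangular image region."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


anno = image_region_annotation("http://example.org/image1.jpg",
                               100, 50, 300, 200,
                               "Signature of the artist")
print(json.dumps(anno, indent=2))
```

The other pairings listed above (document and translation, paragraph and commentary) use the same body/target shape with different selectors or none at all.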

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers: http://iiif.io
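A small sketch of what "standard API" means in practice: IIIF Image API requests are plain URLs of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The helper below only assembles such a URL; the server base and the identifier are hypothetical, and the default size "full" follows Image API 2.x (version 3 uses "max" instead).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble an IIIF Image API request URL; the identifier is percent-encoded."""
    return (f"{base.rstrip('/')}/{quote(identifier, safe='')}"
            f"/{region}/{size}/{rotation}/{quality}.{fmt}")


# Hypothetical server and identifier: top-left 1024x1024 region, scaled to 512 px wide.
url = iiif_image_url("https://example.org/iiif", "ms/folio 1r",
                     region="0,0,1024,1024", size="512,")
```

Because the request is fully described by the URL, any compliant viewer can ask any compliant server for tiles, which is what enables deep zoom across institutions.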

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD

45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BIBFRAME: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats

453 A TASTE OF FRBROO

EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk

4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi s 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none of them is very good

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015), Mlist for comments
Conceptual Model 1.0 (Sep 2016): Document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)

51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)

515 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo

52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
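To make "leveraging coreferencing across VIAF and Wikidata" concrete, here is a minimal sketch that merges pairwise "same entity" links into clusters with a union-find structure. It is an illustration only, not how VIAF or Wikidata compute their clusters; the function name and all identifiers are made-up placeholders.

```python
def coreference_clusters(pairs):
    """Group identifiers into clusters, given pairwise 'same entity' links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for x in list(parent):
        clusters.setdefault(find(x), set()).add(x)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Placeholder identifiers: a Wikidata-VIAF link plus a VIAF-GND link
# put all three records into one cluster.
links = [("wikidata:Q_A", "viaf:111"),
         ("viaf:111", "gnd:AAA"),
         ("wikidata:Q_B", "rkd:222")]

clusters = coreference_clusters(links)
```

The first cluster shows the gain: a Wikidata item linked only to VIAF inherits the GND link through the VIAF cluster, and a non-constituent link (here the RKD one) could in turn be offered back to VIAF.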

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences: 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists

62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

Watch the video (D.Oldman). (Not Ontotext work)

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL

66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
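Jittering can be sketched as adding a small random offset to each marker, so that objects sharing one country-level coordinate spread out instead of stacking. This is an illustrative sketch, not the code used in the EFD application: the function name, the 0.5 degree radius and the approximate centre point used for Bulgaria are all assumptions.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset of at most radius_deg per axis."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))


# Assumed approximate centre of Bulgaria; a seeded generator keeps the map stable between runs.
CENTER = (42.7, 25.5)
rng = random.Random(42)
points = [jitter(*CENTER, radius_deg=0.5, rng=rng) for _ in range(5)]
```

Seeding the generator (or deriving the offset from the object ID) matters in practice: otherwise every page reload would move the flags around.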

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition

67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

External Ontologies
See V.Alexiev, GVP LOD: Ontologies and Semantic Representation, CIDOC 2014

Prefix  Ontology                              Used for
bibo    Bibliography Ontology                 Sources
dc      Dublin Core Elements                  common
dct     Dublin Core Terms                     common
foaf    Friend of a Friend ontology           Contributors
iso     ISO 25964 (latest on thesauri)        iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                 Basic RDF representation
prov    Provenance Ontology                   Revision history
rdf     Resource Description Framework        Basic RDF representation
rdfs    RDF Schema                            Basic RDF representation
schema  Schema.org                            common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System  Basic vocabulary representation
skosxl  SKOS Extension for Labels             Rich labels
wgs     W3C World Geodetic Survey geo         Geo (TGN)
xsd     XML Schema Datatypes                  Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty self-inverse)
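What declaring relations as owl:inverseOf pairs (or as owl:SymmetricProperty) buys can be sketched as follows: each stated relation implies its inverse, so only one direction needs to be entered. The property and entity names below are hypothetical and do not come from the GVP ontology; a triplestore with the right rule set would do this by inference rather than with a helper function.

```python
def materialize_inverses(triples, inverse_of, symmetric=()):
    """Add the inverse statement for every triple whose property has a declared inverse."""
    inv = dict(inverse_of)
    inv.update({v: k for k, v in inverse_of.items()})  # inverseOf works both ways
    inv.update({p: p for p in symmetric})               # symmetric props are self-inverse
    out = set(triples)
    for s, p, o in triples:
        if p in inv:
            out.add((o, inv[p], s))
    return out


# Hypothetical associative relations between two artists.
stated = {("ex:artistA", "ex:teacher_of", "ex:artistB"),
          ("ex:artistA", "ex:collaborated_with", "ex:artistB")}

full = materialize_inverses(stated,
                            inverse_of={"ex:teacher_of": "ex:student_of"},
                            symmetric=["ex:collaborated_with"])
```

Two stated triples become four, and queries can follow a relation from either end.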

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in J.Cobb 2014, Getty Vocabs: Why LOD? Why Now? E.g.:

AAT used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like

681 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo. CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
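The first step of that idea, turning one source record into several linked Person records, might be sketched like this. The field names come from the sample record above; the output layout, the ID scheme and the function name are assumptions, and the only "inference" shown is attaching the maiden name to the spouse. Matching against other records is not attempted.

```python
# Maps a name field of the source record to the relation it implies (assumed naming).
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


def persons_from_record(rec):
    """Return (persons, links): one Person per name mentioned, linked to the main person."""
    main = {"id": f"{rec['id']}/main",
            "name": f"{rec['firstName']} {rec['lastName']}",
            "gender": rec.get("gender")}
    persons, links = [main], []
    for field, relation in RELATION_FIELDS.items():
        if field not in rec:
            continue
        person = {"id": f"{rec['id']}/{relation}", "name": rec[field]}
        if relation == "spouse" and "nameSpouseMaiden" in rec:
            person["maidenName"] = rec["nameSpouseMaiden"]
        persons.append(person)
        links.append((main["id"], relation, person["id"]))
    return persons, links


record = {"id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
          "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
          "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
          "nameSibling": "Jack Jones"}

persons, links = persons_from_record(record)
```

Each derived Person can then be compared (by name, dates and relations) with Person records created from other sources, which is where the network emerges.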

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews:

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard     Cos dist   punishment   Cos dist
guarding  0.593507   punishments  0.668144
sentry    0.512083   punish       0.601212
hlinka    0.496201   punishing    0.543213
gate      0.490032   beatings     0.527033
watching  0.484647   penalty      0.497262
rifle     0.484379   deserved     0.490157
lookout   0.482025   beaten       0.473870
patrol    0.477233   straf        0.473338
soldier   0.475982   offense      0.461230
guarded   0.474689   executing    0.459965
police    0.474291   merciless    0.455123

semantic differencing (interesting): KGB - Stalin + Hitler = SS
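The two operations behind the table and the "semantic differencing" line are cosine similarity between word vectors and vector arithmetic (a - b + c, then the nearest word). The sketch below uses tiny made-up 3-dimensional vectors and the classic textbook analogy rather than the interview corpus; real word2vec vectors have hundreds of dimensions and are learned from text.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(vectors, a, b, c):
    """Word whose vector is closest to a - b + c, excluding the three input words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Made-up toy vectors: axes roughly mean "male", "female", "royal".
toy = {"man": [1.0, 0.0, 0.0],
       "woman": [0.0, 1.0, 0.0],
       "king": [1.0, 0.0, 1.0],
       "queen": [0.0, 1.0, 1.0],
       "apple": [0.2, 0.2, 0.0]}

nearest = analogy(toy, "king", "man", "woman")
```

The ranked lists above are the same computation run against every word in the vocabulary, keeping the top scores.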

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN

612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6134 NOMISMA

Shared authorities for numismatics Eg a mint

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki


232 FRBR

Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):

Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
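The four WEMI levels nest naturally, which a minimal sketch with Python dataclasses can show. The class and field names are illustrative assumptions (FRBR defines many more attributes and relationships), and the publisher, ISBN placeholder and shelfmarks are invented sample values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    holder: str      # library that owns this physical copy
    shelfmark: str


@dataclass
class Manifestation:
    publisher: str
    isbn: str        # ISBNs identify this level
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)


def all_items(work):
    """Collect every physical copy of a work, across all its expressions and manifestations."""
    return [item
            for expr in work.expressions
            for man in expr.manifestations
            for item in man.items]


# Invented sample data: one English translation, one edition, two copies.
quixote = Work("Don Quixote", [
    Expression("en", [
        Manifestation("Example Press", "ISBN-PLACEHOLDER", [
            Item("City Library", "A-1"),
            Item("University Library", "B-7"),
        ]),
    ]),
    Expression("es"),
])

copies = all_items(quixote)
```

A reader asking for "Don Quixote" is asking at the Work level, while a loan is recorded at the Item level; the model lets a catalogue answer both from the same data.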

233 FRSAD

Anything can be subject (thema) referred to by various namestitles (nomen)

234 FRBR-LRM

FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards

3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

31 SEEING STANDARDS (2)

32 XML SCHEMAS
Do you deal with XML? I bet you do

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

33 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension

332 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)

335 LIDO SCHEMA

Complex schema, e.g. when referring to a related object you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA

34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events

EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline that's also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
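To show how much structure can still be recovered from such a bioghist when dates and names are tagged, here is a small sketch using Python's standard xml.etree.ElementTree. The sample XML repeats the EADiva example with its punctuation restored (the comma in the normalized names is an assumption); the function name and output layout are illustrative.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""


def chron_events(xml_text):
    """Extract normalized date, plain event text and tagged person names per chronitem."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date, event = item.find("date"), item.find("event")
        events.append({
            "date": date.get("normal"),
            "text": " ".join("".join(event.itertext()).split()),
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


events = chron_events(BIOGHIST)
```

The second event illustrates the problem named above: where names are not tagged, only the date survives as data and everything else stays a string.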

35 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant (2002):

marc-must-die.info: MARC is dead (is it really?)
FutureLib: in-depth discussion wiki
Facebook group
Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?

RDF is a simple, abstract data model
It doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID)
It has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info

41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on

OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object

EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs

Many providers use the minimal features and make mistakes Europeana didnt do alot of validation

Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana

formed to push this strategic point (2015-2020)Europeana Data Quality Committee

Evolving speci cation (since 2009)

Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc

42 CIDOC CRM comprehensive reference model used for history historic events

archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual

CIDOC CRM

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

(or including Kindle) (or including Kindle) essential to

understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails

Video Tutorial HTML versionGraphical Representation continuous HTML version

43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image

and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)

W3C TR

Complete Example

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio

441 IIIF PRESENTATION API

Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD

45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies

BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat

soundly criticized

httpbibschemaorg

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats

453 A TASTE OF FRBROO

Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le

TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

Progress report (2015), Conceptual Model 1.0 (Sep 2016), Mlist for comments.

Conceptual Model: document key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)

51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals.

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

515 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

52 VIAF
Virtual International Authority File: 20 national libraries and 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
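To make the coreferencing idea concrete, here is a minimal sketch (not the method used in the Europeana study) of how pairwise "same entity" links from different authorities can be merged into clusters with a union-find structure. All identifiers below are made up for illustration.

```python
from collections import defaultdict


def coreference_clusters(links):
    """Group identifiers into clusters, given pairwise 'same entity' links."""
    parent = {}

    def find(x):
        # Return the representative of x's cluster, registering x if new.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(set)
    for ident in list(parent):
        clusters[find(ident)].add(ident)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical identifiers, not real VIAF/Wikidata/GND/ULAN ids.
links = [
    ("wikidata:Q_A", "viaf:111"),
    ("viaf:111", "gnd:aaa"),
    ("wikidata:Q_B", "ulan:500"),
]
clusters = coreference_clusters(links)
for cluster in clusters:
    print(cluster)
```

A Wikidata item that links to VIAF therefore also reaches every authority in that VIAF cluster, which is why coreferencing across the two is so valuable.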

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters: prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAP:DRGN project)

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals.

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
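The rule set itself is not reproduced here. As a rough illustration of the idea only, the sketch below forward-chains two hypothetical property-chain rules until a fixed point, deriving a "thing from place" relation from lower-level triples. The property names (ex:produced_by, ex:took_place_at, ex:falls_within, ex:from_place) are invented for this example and are neither the CIDOC CRM properties nor the actual GraphDB rules.

```python
def infer(triples, rules):
    """Apply property-chain rules until no new triples appear (forward chaining).

    Each rule is (derived_prop, (prop1, prop2)): if (s, prop1, m) and
    (m, prop2, o) are both present, then (s, derived_prop, o) is added.
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for derived, (p1, p2) in rules:
            new = {
                (s, derived, o)
                for (s, pa, m) in facts if pa == p1
                for (m2, pb, o) in facts if pb == p2 and m2 == m
            }
            if not new <= facts:
                facts |= new
                changed = True
    return facts


# Hypothetical input data and rules.
triples = [
    ("drawing1", "ex:produced_by", "production1"),
    ("production1", "ex:took_place_at", "Amsterdam"),
    ("Amsterdam", "ex:falls_within", "Netherlands"),
]
rules = [
    # intermediate step: an object is "from" the place where it was produced
    ("ex:from_place", ("ex:produced_by", "ex:took_place_at")),
    # and also "from" every broader place that contains that place
    ("ex:from_place", ("ex:from_place", "ex:falls_within")),
]
result = infer(triples, rules)
print(sorted(t for t in result if t[1] == "ex:from_place"))
```

The second rule depends on the output of the first, which is the kind of dependency between properties that the diagram visualizes.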

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: you can't do this using the Europeana API, but it is easy with SPARQL.
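A sketch of what such a statistic might look like, under assumptions: the query below is hypothetical (the use of dc:date and the year extraction are guesses, not the query behind the chart), and the helper only tallies a response in the standard SPARQL JSON results format. No network call is made; the sample response is invented.

```python
import json
from collections import Counter

# Hypothetical query text: property choice and year extraction are assumptions.
QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?year (COUNT(?item) AS ?n) WHERE {
  ?item dc:date ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
} GROUP BY ?year ORDER BY ?year
"""


def counts_by_year(results_json):
    """Turn a SPARQL JSON result (variables ?year and ?n) into {year: count}."""
    data = json.loads(results_json)
    counts = Counter()
    for row in data["results"]["bindings"]:
        counts[row["year"]["value"]] += int(row["n"]["value"])
    return dict(sorted(counts.items()))


# Invented sample response, shaped like the SPARQL 1.1 JSON results format.
sample = json.dumps({
    "head": {"vars": ["year", "n"]},
    "results": {"bindings": [
        {"year": {"type": "literal", "value": "1914"},
         "n": {"type": "literal", "value": "120"}},
        {"year": {"type": "literal", "value": "1915"},
         "n": {"type": "literal", "value": "95"}},
    ]},
})
print(counts_by_year(sample))
```

The resulting dictionary can be fed to any charting library.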

66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
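One way to jitter markers, as a minimal sketch (not necessarily what the EFD app does): displace each point by a random offset within a chosen radius. The center coordinates and the 50 km radius below are illustrative assumptions, and the conversion uses the rough figure of 111 km per degree of latitude.

```python
import math
import random

KM_PER_DEGREE_LAT = 111.0  # rough approximation


def jitter(lat, lon, radius_km, rng):
    """Return a point displaced uniformly at random within radius_km of (lat, lon)."""
    # sqrt() makes the points uniform over the disc instead of bunching at the center
    distance = radius_km * math.sqrt(rng.random())
    bearing = rng.uniform(0, 2 * math.pi)
    dlat = distance * math.cos(bearing) / KM_PER_DEGREE_LAT
    dlon = distance * math.sin(bearing) / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


# Approximate center of Bulgaria (assumed values for illustration).
CENTER = (42.7, 25.5)
rng = random.Random(0)  # seeded so the layout is reproducible
points = [jitter(CENTER[0], CENTER[1], 50, rng) for _ in range(100)]
print(points[:3])
```

Seeding the generator keeps each marker in the same spot across page loads.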

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of the LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.

External Ontologies

Prefix   Ontology                                 Used for
bibo     Bibliography Ontology                    Sources
dc       Dublin Core Elements                     common
dct      Dublin Core Terms                        common
foaf     Friend of a Friend ontology              Contributors
iso      ISO 25964 (latest on thesauri)           iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                    Basic RDF representation
prov     Provenance Ontology                      Revision history
rdf      Resource Description Framework           Basic RDF representation
rdfs     RDF Schema                               Basic RDF representation
schema   Schema.org                               common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System     Basic vocabulary representation
skosxl   SKOS Extension for Labels                Rich labels
wgs      W3C World Geodetic Survey geo            Geo (TGN)
xsd      XML Schema Datatypes                     Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:

Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty: self-inverse).
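What the inverse/symmetric declarations buy you can be sketched in a few lines: given a statement in one direction, the other direction follows. This is only an illustration of the semantics; the relation names (ex:teacher_of, ex:student_of, ex:sibling_of) are hypothetical and not actual GVP associative relations.

```python
def add_inverses(triples, inverse_of, symmetric):
    """Add the statements implied by inverse pairs and symmetric properties."""
    pairs = dict(inverse_of)
    pairs.update({v: k for k, v in inverse_of.items()})  # inverseOf works both ways
    out = set(triples)
    for s, p, o in triples:
        if p in pairs:
            out.add((o, pairs[p], s))
        if p in symmetric:
            out.add((o, p, s))  # a symmetric property is its own inverse
    return out


# Hypothetical relations and individuals.
inverse_of = {"ex:teacher_of": "ex:student_of"}
symmetric = {"ex:sibling_of"}
triples = [
    ("ex:personA", "ex:teacher_of", "ex:personB"),
    ("ex:personC", "ex:sibling_of", "ex:personD"),
]
closed = add_inverses(triples, inverse_of, symmetric)
print(sorted(closed))
```

A triple store with OWL inference would derive these statements itself; the sketch just spells out the result.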

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:

681 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match them to other Person records in the database.
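A minimal sketch of the first step, using the example record above: split one source record into several Person records and attach a couple of likely inferences (the maiden name and marriage date belong to the spouse as well). The output field names ("role", "source", "maidenName") are invented for this illustration and are not an EHRI schema; matching against other records is left out.

```python
def person_records(rec):
    """Derive one Person record per person mentioned in a source record."""
    source = rec["id"]
    people = [{
        "name": f'{rec["firstName"]} {rec["lastName"]}',
        "role": "self",
        "gender": rec.get("gender"),
        "source": source,
    }]
    related = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}
    for field, role in related.items():
        if field in rec:
            people.append({"name": rec[field], "role": role, "source": source})
    # Likely inferences (assumptions, to be confirmed by later matching):
    # the maiden name and the marriage date also describe the spouse.
    for person in people:
        if person["role"] == "spouse":
            if "nameSpouseMaiden" in rec:
                person["maidenName"] = rec["nameSpouseMaiden"]
            if "dateMarriage" in rec:
                person["dateMarriage"] = rec["dateMarriage"]
    return people


rec = {
    "id": "123456",
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
people = person_records(rec)
for person in people:
    print(person)
```

Each derived record keeps a pointer to its source record, so a later match can be traced back to the evidence.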

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard       Cos dist    punishment    Cos dist
guarding    0.593507    punishments   0.668144
sentry      0.512083    punish        0.601212
hlinka      0.496201    punishing     0.543213
gate        0.490032    beatings      0.527033
watching    0.484647    penalty       0.497262
rifle       0.484379    deserved      0.490157
lookout     0.482025    beaten        0.473870
patrol      0.477233    straf         0.473338
soldier     0.475982    offense       0.461230
guarded     0.474689    executing     0.459965
police      0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
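The two operations behind these results can be sketched without any word2vec library: cosine similarity between vectors, and "semantic differencing" as vector arithmetic followed by a nearest-neighbour lookup. The 3-dimensional vectors and the generic labels below are toy values invented for illustration; real embeddings have hundreds of dimensions and are learned from the interview text.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors: axes are (organization-ness, context A, context B).
vectors = {
    "leader_A": (0.0, 1.0, 0.0),
    "leader_B": (0.0, 0.0, 1.0),
    "org_A": (1.0, 1.0, 0.0),
    "org_B": (1.0, 0.0, 1.0),
    "other_org": (1.0, 0.0, 0.0),
}
answer = analogy(vectors, "org_A", "leader_A", "leader_B")
print(answer)
```

Here org_A - leader_A + leader_B lands exactly on org_B, mirroring the pattern of the slide's example.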

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN

612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms; uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms; uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius... rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.


233 FRSAD

Anything can be a subject (thema), referred to by various names/titles (nomen).

234 FRBR-LRM

FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards.

3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (By Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library.)

31 SEEING STANDARDS (2)

32 XML SCHEMAS
Do you deal with XML? I bet you do.

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).

331 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.

332 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.

333 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

334 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)

335 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA.

34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events

EAG: So-called controlled access points are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
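To show what little structure such a chronology offers for semantic extraction, here is a small sketch using Python's standard xml.etree.ElementTree. It pulls out the normalized date and any tagged person names; everything else in the event remains free text. The function name and output keys are illustrative choices.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def chron_events(xml_text):
    """Extract date, tagged persons, and plain text from each chronitem."""
    root = ET.fromstring(xml_text.strip())
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            "persons": [p.get("normal") for p in event.iter("persname")],
            # collapse whitespace in the mixed-content event text
            "text": " ".join("".join(event.itertext()).split()),
        })
    return events


events = chron_events(BIOGHIST)
for e in events:
    print(e)
```

The relationship itself ("succeeds ... as department head") is only available as a string, which is exactly the problem noted above.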

35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002:

MARC is dead (is it really?): in-depth discussion wiki. Links: marc-must-die.info, FutureLib, Facebook group
Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and conversion back to some other format "lowering"?

RDF is a simple abstracted data model
It doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). It has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
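The "no nesting bias" point can be illustrated with a toy lifting function: whether a related entity is nested inside a record or referenced by identifier, the resulting set of triples is the same. This is a simplified sketch with invented names ("ex:book1", "ex:person1", the "@id" key is borrowed from JSON-LD style), not a real RDF library.

```python
from itertools import count


def lift(record, subject, ids=None):
    """Flatten a nested dict into (subject, property, value) triples."""
    if ids is None:
        ids = count(1)  # generator of blank-node numbers
    triples = []
    for prop, value in record.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            if isinstance(v, dict):
                # a nested object becomes its own node, linked from the parent
                node = v.get("@id") or f"_:b{next(ids)}"
                triples.append((subject, prop, node))
                nested = {k: x for k, x in v.items() if k != "@id"}
                triples.extend(lift(nested, node, ids))
            else:
                triples.append((subject, prop, v))
    return triples


# The same information, once nested and once referenced by identifier.
nested = {"title": "A Book", "creator": {"@id": "ex:person1", "name": "An Author"}}
by_reference = {"title": "A Book", "creator": {"@id": "ex:person1"}}
person = {"name": "An Author"}

graph_nested = set(lift(nested, "ex:book1"))
graph_by_ref = set(lift(by_reference, "ex:book1")) | set(lift(person, "ex:person1"))
print(graph_nested == graph_by_ref)
```

Two XML documents with these two shapes would need different XPath expressions; the lifted graph is queried the same way in both cases.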

41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana

Europeana Data Quality Committee formed to push this strategic point (2015-2020)

Evolving specification (since 2009)

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.

42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
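The short-cut vs long-path construct can be sketched as a rewrite over triples: each crm:P43_has_dimension statement is expanded into an intermediate crm:E16_Measurement node, to which further details (who measured, when) could then be attached. The blank-node naming and the sample data are assumptions for illustration.

```python
def expand_dimension_shortcut(triples):
    """Rewrite each P43_has_dimension triple as a long path via a Measurement."""
    out = []
    n = 0
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            n += 1
            measurement = f"_:measurement{n}"  # hypothetical blank-node label
            out += [
                (measurement, "rdf:type", "crm:E16_Measurement"),
                (s, "crm:P39i_was_measured_by", measurement),
                (measurement, "crm:P40_observed_dimension", o),
            ]
        else:
            out.append((s, p, o))
    return out


# Invented sample data.
short = [
    ("ex:painting1", "crm:P2_has_type", "ex:Painting"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:painting1_height"),
]
long_path = expand_dimension_shortcut(short)
for t in long_path:
    print(t)
```

Going the other way (long path to short-cut) is a simple inference, which is why a dataset can publish the detailed form and still support simple queries.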

43 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
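As a small illustration of the "Image and region over it" case, the sketch below builds a Web Annotation as a JSON-LD document: a textual body attached to a rectangular region of an image via a media-fragment selector. The annotation and image URLs are placeholders; the keys follow the W3C Web Annotation Data Model as I understand it.

```python
import json


def image_region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation that comments on a rectangular image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Placeholder URLs for illustration only.
annotation = image_region_annotation(
    "http://example.org/anno/1",
    "http://example.org/images/painting1.jpg",
    (100, 50, 300, 200),
    "Signature of the artist",
)
print(json.dumps(annotation, indent=2))
```

The same body/target pattern covers the other cases (bookmark, translation, commentary); only the selector and motivation change.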

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers.
http://iiif.io
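The IIIF Image API addresses an image, or a region of it, through a URL of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. A minimal sketch of a URL builder follows; the server base and identifier are placeholders, and note that the default "max" size keyword applies to newer API versions (older 2.x servers use "full").

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL from its path components."""
    # the identifier must be percent-encoded, including any slashes
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Placeholder server and identifier for illustration only.
BASE = "https://example.org/iiif/"
whole = iiif_image_url(BASE, "ms 12/f1")
detail = iiif_image_url(BASE, "ms 12/f1", region="0,0,512,512", size="256,")
print(whole)
print(detail)
```

A deep-zoom viewer issues many such requests for small regions at different sizes as the user pans and zooms.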

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations on IIIF and JSON-LD.

45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

453 A TASTE OF FRBROO

EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, with RDF in the core and marc2rdf, rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 17000";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

6.4 CONSERVATIONSPACE
Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services

6.5.1 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL

6.6 EUROPEANA FOOD AND DRINK
Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector

EFD Semantic App

6.6.1 TASTY BULGARIAN RECIPES

Eg 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames
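
A hierarchical facet can be sketched as counting each object for its own place and for every broader place above it. The snippet below is only an illustration: the `PARENT` table is a hypothetical stand-in for the Geonames hierarchy, and the object-to-place assignments are made up.

```python
from collections import Counter

# Hypothetical child -> parent links, standing in for the Geonames place hierarchy.
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Lyon": "France",
    "France": "Europe",
}

def ancestors(place):
    """Yield the place itself, then each broader place up to the root."""
    while place is not None:
        yield place
        place = PARENT.get(place)

def facet_counts(object_places):
    """Count each object once for its place and once for every ancestor."""
    counts = Counter()
    for place in object_places.values():
        counts.update(ancestors(place))
    return counts

# Made-up objects, each enriched with one place.
objects = {"obj1": "Sofia", "obj2": "Plovdiv", "obj3": "Lyon", "obj4": "Bulgaria"}
counts = facet_counts(objects)
```

Selecting "Bulgaria" in such a facet would then also retrieve the objects tagged with Sofia or Plovdiv.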

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them
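
Jittering can be sketched as adding a small random offset to each marker. This is a minimal illustration, not the project's code: the centre coordinates are approximate, and the offset size (`max_offset_deg`) is an assumed value.

```python
import random

def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset on each axis."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )

# Approximate centre of Bulgaria; the offset size is an illustrative guess.
BULGARIA = (42.7, 25.5)
rng = random.Random(42)  # seeded so the sketch is reproducible
markers = [jitter(*BULGARIA, rng=rng) for _ in range(9000)]
```

A real map would probably also clip the jittered points to the country outline, which this sketch does not attempt.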

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition

6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation (V. Alexiev, CIDOC 2014)
External Ontologies:

Prefix  Ontology                               Used for
bibo    Bibliography Ontology                  Sources
dc      Dublin Core Elements                   common
dct     Dublin Core Terms                      common
foaf    Friend of a Friend ontology            Contributors
iso     ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                  Basic RDF representation
prov    Provenance Ontology                    Revision history
rdf     Resource Description Framework         Basic RDF representation
rdfs    RDF Schema                             Basic RDF representation
schema  Schema.org                             common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System   Basic vocabulary representation
skosxl  SKOS Extension for Labels              Rich labels
wgs     W3C World Geodetic Survey geo          Geo (TGN)
xsd     XML Schema Datatypes                   Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: a custom sub-class; a custom (sub-)prop; an ontology value (eg <termkindAbbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, which is self-inverse)
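
What inverse pairs and symmetric properties buy you can be sketched in a few lines: every asserted relation implies a second statement in the opposite direction. The relation names and terms below are hypothetical placeholders, not actual GVP associative relations.

```python
# Hypothetical relation names; the real associative relations are not listed here.
INVERSE_OF = {"producedBy": "produces", "produces": "producedBy"}
SYMMETRIC = {"relatedTo"}

def with_inverses(triples):
    """Return the triples plus the opposite-direction statement each one implies."""
    result = set(triples)
    for s, p, o in triples:
        if p in SYMMETRIC:
            result.add((o, p, s))
        elif p in INVERSE_OF:
            result.add((o, INVERSE_OF[p], s))
    return result

asserted = {
    ("etching", "producedBy", "etcher"),
    ("oil paint", "relatedTo", "oil painting"),
}
closed = with_inverses(asserted)
```

In a triple store this is normally left to the reasoner; the function only shows which statements the OWL axioms entail.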

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). Eg:

AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT; enriched labels

6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings but the Museum has 180k artworks. WD query shown as image grid

6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) "Cast after"

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
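
The idea can be sketched as follows, using the field names from the record above. The `Person` class, the `persons_from_record` helper and the specific inferences (spouse gender, maiden name) are assumptions made for illustration; matching against the database is not shown.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    maiden_name: Optional[str] = None
    relations: list = field(default_factory=list)  # (relation, other person's name)

RELATIVE_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}
OPPOSITE = {"Male": "Female", "Female": "Male"}

def persons_from_record(rec):
    """Create a Person for the record subject and for each relative it mentions."""
    main = Person(f"{rec['firstName']} {rec['lastName']}", rec.get("gender"))
    people = [main]
    for key, relation in RELATIVE_FIELDS.items():
        if key not in rec:
            continue
        other = Person(rec[key])
        if relation == "spouse":
            # Likely (not certain) inferences for a 1921 marriage record.
            other.gender = OPPOSITE.get(main.gender)
            other.maiden_name = rec.get("nameSpouseMaiden")
        main.relations.append((relation, other.name))
        people.append(other)
    return people

record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
people = persons_from_record(record)
```

Each inferred value should be kept distinguishable from recorded values, since later matching has to weigh them differently.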

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist   punishment   Cos dist
guarding   0.593507   punishments  0.668144
sentry     0.512083   punish       0.601212
hlinka     0.496201   punishing    0.543213
gate       0.490032   beatings     0.527033
watching   0.484647   penalty      0.497262
rifle      0.484379   deserved     0.490157
lookout    0.482025   beaten       0.473870
patrol     0.477233   straf        0.473338
soldier    0.475982   offense      0.461230
guarded    0.474689   executing    0.459965
police     0.474291   merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
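
Both the similarity scores and the "semantic differencing" come down to vector arithmetic plus cosine similarity. The sketch below uses toy 4-dimensional vectors invented for illustration (they are not taken from the INRIA model) so that the arithmetic can be followed by hand.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors, invented for illustration.
# Dimensions: Soviet, German, organization, leader.
VECTORS = {
    "KGB":    (1.0, 0.0, 1.0, 0.0),
    "Stalin": (1.0, 0.0, 0.0, 1.0),
    "Hitler": (0.0, 1.0, 0.0, 1.0),
    "SS":     (0.0, 1.0, 1.0, 0.0),
    "Berlin": (0.0, 1.0, 0.0, 0.0),
}

def analogy(a, b, c):
    """Return the word closest to a - b + c, excluding the three inputs."""
    target = tuple(x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c]))
    candidates = (w for w in VECTORS if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))

result = analogy("KGB", "Stalin", "Hitler")
```

With a real word2vec model the vectors have hundreds of dimensions and are learned from the interview text, but the lookup works the same way.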

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates

6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN

6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War

6.13.4 NOMISMA

Shared authorities for numismatics. Eg a mint

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF; imports data as needed
Blog, Wiki


2.3.4 FRBR-LRM

FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards

3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

3.1 SEEING STANDARDS (2)

3.2 XML SCHEMAS
Do you deal with XML? I bet you do

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. Eg EAD3 is mastered in RNC, then RNG and XSD produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (eg cross-field validation)

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)

3.3.1 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, eg for Dimension

3.3.2 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements

3.3.3 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this

3.3.4 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA, museumdat, with inspiration from CIDOC CRM (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)

3.3.5 LIDO SCHEMA

Complex schema: eg when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements: inherited from CDWA

3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events

EAG: So-called "controlled access points" are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable, and instead put person info in EAD's <bioghist> element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged, but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head.</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization.</event>
    </chronitem>
  </chronlist>
</bioghist>
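
To show what can and cannot be recovered from such markup, here is a small sketch that extracts dates, tagged person names and event text from a bioghist chronology. It assumes the element is not namespaced (real EAD3 files usually are), and the `events` helper is a name made up for this example.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head.</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization.</event>
    </chronitem>
  </chronlist>
</bioghist>
"""

def events(xml_text):
    """Yield one dict per chronitem: normalized date, tagged persons, plain text."""
    root = ET.fromstring(xml_text)
    for item in root.iter("chronitem"):
        event = item.find("event")
        yield {
            "date": item.find("date").get("normal"),
            "persons": [p.get("normal") for p in event.iter("persname")],
            # Collapse whitespace in the mixed-content event description.
            "text": " ".join("".join(event.itertext()).split()),
        }

extracted = list(events(BIOGHIST))
```

The persons come out as name strings only: nothing in the markup says which authority record they refer to, which is exactly the "strings, not things" problem.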

3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002

MARC is dead (is it really?); in-depth discussion wiki

marc-must-die.info, FutureLib, Facebook group

Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?

RDF is a simple abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info

4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM inspired: events, some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

Criticized that it's not expressive enough. Eg can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation

Old objects retro-converted from ESE are poor (only text), though some enrichments added by Europeana

Europeana Data Quality Committee formed to push this strategic point (2015-2020)

Evolving specification (since 2009)

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc

4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc; by CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle), Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
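
The short-cut vs long-path construct can be sketched as a rewrite over triples: each crm:P43_has_dimension statement is replaced by a path through an intermediate crm:E16_Measurement node, which can then carry extra details (who measured, when). The `ex:` identifiers and the blank-node labels are invented for this example.

```python
def expand_dimension_shortcut(triples):
    """Replace each P43 short-cut triple by the P39i/P40 long path via a new Measurement node."""
    expanded = []
    n = 0
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            n += 1
            measurement = f"_:measurement{n}"  # blank node label, invented here
            expanded.append((s, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, "crm:P40_observed_dimension", o))
        else:
            expanded.append((s, p, o))
    return expanded

short = [
    ("ex:painting1", "rdf:type", "crm:E24_Physical_Man-Made_Thing"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
]
long_path = expand_dimension_shortcut(short)
```

Going the other way (inferring the short-cut from the long path) is the more common direction in practice, since it lets simple queries work over detailed data.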

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources. Eg Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc etc. See eg Rob Sanderson's presentations: IIIF and JSON-LD

4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (eg citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA): RDAregistry.info is well organized

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats

4.5.3 A TASTE OF FRBROO

EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, eg on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search for related books on a Kiosk

4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi s 293-294"@no;
  d:priceInfo "Nkr 17000";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Progress report (2015)
Conceptual Model 10 (Sep 2016), Mlist for comments: document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)

5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). Eg Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2)

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this

5.1.6 HISTROPEDIA

Timelines of everything. Eg paintings by Leonardo

5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment: study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the "library-tradition" datasets (dominated by VIAF) and the "LOD-tradition" datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
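
Why coreferencing pays off for name enrichment can be sketched with two small lookup tables: once a Wikidata item is linked to a VIAF cluster, the name forms of both become available for matching. The identifiers below are placeholders and the tables hold only a handful of illustrative name forms, not actual dataset contents.

```python
# Placeholder identifiers and a few illustrative name forms, not real records.
VIAF_NAMES = {
    "viaf:1": {"Cranach, Lucas", "Lucas Cranach der Ältere"},
}
WIKIDATA_NAMES = {
    "wd:Q1": {"Lucas Cranach the Elder", "Lucas Cranach l'Ancien"},
    "wd:Q2": {"Some Uncoreferenced Person"},
}
COREFS = {"wd:Q1": "viaf:1"}  # Wikidata item -> VIAF cluster

def merged_names(wd_id):
    """Return the Wikidata name forms, plus VIAF forms if a coreference exists."""
    names = set(WIKIDATA_NAMES.get(wd_id, ()))
    viaf_id = COREFS.get(wd_id)
    if viaf_id is not None:
        names |= VIAF_NAMES.get(viaf_id, set())
    return names

enriched = merged_names("wd:Q1")
unlinked = merged_names("wd:Q2")
```

Since the two traditions share few name forms, each new coreference link typically adds forms that neither side had on its own.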

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc etc

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: a custom sub-class, a custom (sub-)prop, or an ontology value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty if self-inverse)
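The inverse/symmetric pattern can be illustrated with a small sketch. It is not the GVP implementation: the property names (`ex:teacherOf`, `ex:studentOf`, `ex:relatedTo`) and the helper `add_inverses` are hypothetical, and a real repository would derive these statements from owl:inverseOf and owl:SymmetricProperty declarations rather than from Python dictionaries.

```python
# Hypothetical relation declarations, standing in for owl:inverseOf pairs
# and owl:SymmetricProperty declarations.
INVERSE_OF = {"ex:teacherOf": "ex:studentOf"}
SYMMETRIC = {"ex:relatedTo"}


def add_inverses(triples):
    """Return the input triples plus the statements implied by the declarations."""
    # An inverse pair works in both directions.
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
        elif p in SYMMETRIC:
            # A symmetric property is its own inverse.
            result.add((o, p, s))
    return result


sample = [
    ("ex:a", "ex:teacherOf", "ex:b"),
    ("ex:c", "ex:relatedTo", "ex:d"),
]
closed = add_inverses(sample)
for triple in sorted(closed):
    print(triple)
```

Asserting only one direction and inferring the other keeps the source data small while letting queries follow a relation from either end.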

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after"

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
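The idea can be sketched roughly as follows. This is an illustration only, not the EHRI pipeline: the dictionary layout, the helpers `mentioned_persons`, `normalize` and `match`, and the exact-name matching rule are all assumptions (real matching would need fuzzy comparison and more evidence than a name).

```python
# Fields that mention another person, and the relation they imply (assumed mapping).
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def mentioned_persons(rec):
    """Create a minimal Person record for each person mentioned in rec."""
    persons = []
    for field, relation in RELATION_FIELDS.items():
        name = rec.get(field)
        if not name:
            continue
        person = {"name": name, "relatedTo": rec["id"], "relation": relation}
        # Likely inference: the spouse's maiden name belongs to the spouse record.
        if relation == "spouse" and rec.get("nameSpouseMaiden"):
            person["maidenName"] = rec["nameSpouseMaiden"]
        persons.append(person)
    return persons


def normalize(name):
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def match(person, database):
    """Return database records whose normalized name equals the person's."""
    return [p for p in database if normalize(p["name"]) == normalize(person["name"])]


record = {
    "id": "123456",
    "firstName": "John",
    "lastName": "Smith",
    "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
persons = mentioned_persons(record)
database = [{"id": "777", "name": "mike  smith"}]  # hypothetical existing record
candidates = match(persons[1], database)
print(persons)
print(candidates)
```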

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
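The cosine measure and the "A - B + C" arithmetic behind such results can be sketched in plain Python. The vectors below are made-up three-dimensional toys with placeholder words (`orgA`, `leaderA`, and so on), not output of the INRIA experiments; real word2vec vectors have hundreds of dimensions and come from a trained model.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors):
    """Return the word whose vector is closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # The three query words are excluded, as is usual for this kind of lookup.
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Toy vectors (assumed): each "org" is its "leader" plus a shared "organization" axis.
VECTORS = {
    "leaderA": [1.0, 0.0, 0.0],
    "orgA": [1.0, 1.0, 0.0],
    "leaderB": [0.0, 0.0, 1.0],
    "orgB": [0.0, 1.0, 1.0],
    "other": [1.0, 0.0, 1.0],
}

result = analogy("orgA", "leaderA", "leaderB", VECTORS)
print(result)
```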

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames so we can get coordinates

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki


3 GLAM METADATA SCHEMAS

How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)

3.1 SEEING STANDARDS (2)

3.2 XML SCHEMAS

Do you deal with XML? I bet you do.

XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
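To make "cross-field validation" concrete, here is a minimal sketch of such a rule. It is written in Python with the standard library rather than in Schematron itself, and the element names (`record`, `dateFrom`, `dateTo`) and the helper `check_date_order` are hypothetical; the point is only that the rule compares two fields of the same record, which a grammar-based schema cannot express.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the second record has its dates the wrong way round.
SAMPLE = """
<records>
  <record id="r1"><dateFrom>1921</dateFrom><dateTo>1945</dateTo></record>
  <record id="r2"><dateFrom>1950</dateFrom><dateTo>1940</dateTo></record>
</records>
"""


def check_date_order(xml_text):
    """Return the ids of records whose dateFrom is later than dateTo."""
    root = ET.fromstring(xml_text)
    failures = []
    # ElementTree supports a limited XPath subset; ".//record" finds every record.
    for rec in root.findall(".//record"):
        start = int(rec.findtext("dateFrom"))
        end = int(rec.findtext("dateTo"))
        if start > end:
            failures.append(rec.get("id"))
    return failures


failed = check_date_order(SAMPLE)
print(failed)
```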

Tools:

https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)

3.3 MUSEUM METADATA CDWA

Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)

3.3.1 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension

3.3.2 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements

3.3.3 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this

3.3.4 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)

3.3.5 LIDO SCHEMA

Complex schema, e.g. when referring to a related object you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA

3.4 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events

EAG: So-called controlled access points are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline that's also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event>
        <persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head
      </event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
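As a rough illustration of how much structure can still be recovered from such a bioghist element, the sketch below pulls out the normalized dates and the tagged person names using the Python standard library. The function name `extract_events` and the output layout are assumptions made for this example; untagged names in the event text would be missed, which is exactly the problem noted above.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""


def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, tagged persons, plain text."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date").get("normal")
        event = item.find("event")
        # Only names wrapped in <persname> can be picked up reliably.
        people = [p.get("normal") for p in event.iter("persname")]
        # Join all text nodes and collapse whitespace to get a readable sentence.
        text = " ".join("".join(event.itertext()).split())
        events.append({"date": date, "people": people, "text": text})
    return events


events = extract_events(BIOGHIST)
for e in events:
    print(e)
```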

3.5 LIBRARY METADATA MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002

marc-must-die.info, Facebook group
MARC is dead (is it really?): in-depth discussion on the FutureLib wiki
Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting" and back to some other format "lowering"?

RDF is a simple, abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
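A toy sketch of "lifting" may help: a nested record is flattened into subject-predicate-object triples, so it no longer matters whether a sub-record was nested or referenced. This is plain Python, not an RDF library; the function `lift`, the `ex:` identifiers, the blank-node naming and the sample record are all made up for illustration.

```python
def lift(subject, record, triples=None, counter=None):
    """Flatten a nested dict into (subject, predicate, object) triples.

    Nested dicts become new nodes with generated ids; lists yield one
    triple per value.
    """
    if triples is None:
        triples = []
    if counter is None:
        counter = [0]  # shared mutable counter for generated node ids
    for key, value in record.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            if isinstance(v, dict):
                counter[0] += 1
                node = f"_:b{counter[0]}"
                triples.append((subject, key, node))
                lift(node, v, triples, counter)
            else:
                triples.append((subject, key, v))
    return triples


# Hypothetical nested record, as it might come from an XML or JSON source.
painting = {
    "title": "Example portrait",
    "creator": {"name": "Example Artist"},
    "material": ["oil paint", "canvas"],
}
triples = lift("ex:painting1", painting)
for t in triples:
    print(t)
```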

4.1 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM inspired: events, some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:
  EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009):
  Currently considering actual implementation of Events
  Extensions for manuscripts, music, fashion, etc.

4.2 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations

Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
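The short-cut versus long-path construct can be shown with plain triples. In the sketch below the three property names come from the slide and crm:E16_Measurement is the CRM class for measurement events; the `ex:` identifiers and the function `expand_dimension_shortcut` are hypothetical. The intermediate measurement node is what "allows more details", since who measured and when can be attached to it.

```python
def expand_dimension_shortcut(triples):
    """Rewrite each P43_has_dimension short-cut as the long path via a measurement."""
    out = []
    n = 0
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            n += 1
            measurement = f"ex:measurement{n}"  # hypothetical generated node id
            out.append((s, "crm:P39i_was_measured_by", measurement))
            out.append((measurement, "rdf:type", "crm:E16_Measurement"))
            out.append((measurement, "crm:P40_observed_dimension", o))
        else:
            # All other statements are passed through unchanged.
            out.append((s, p, o))
    return out


short = [
    ("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
    ("ex:painting1", "crm:P2_has_type", "ex:Painting"),
]
long_path = expand_dimension_shortcut(short)
for t in long_path:
    print(t)
```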

4.3 WEB ANNOTATION (OPEN ANNOTATION OA)

W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml)

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson presentations: IIIF and JSON-LD

4.5 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
BibFrame: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats

4.5.3 A TASTE OF FRBROO

EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf / rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk

4.5.6.1 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi s 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

4.6 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Progress report (2015)
Conceptual Model 1.0 (Sep 2016; Mlist for comments): document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)

5.1 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for: works by painter across collections (catalogue raisonné), e.g. Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo

5.2 VIAF

Virtual International Authority File. 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN, RKD artists, works, LoC Authorities, VIAF (not in M-n-M but on WD), BM persons, BBC YourPaintings, Artsy, etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters; prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments

6.1 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013). Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals

6.2.2 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, weaved using Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

6.2.5 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman)

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

6.4 CONSERVATIONSPACE

Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

6.6.3 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

6.6.9 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
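Jittering can be as simple as adding a small random offset to each marker. The sketch below is an assumption about how this might be done, not the EFD code: the function `jitter`, the offset of 0.5 degrees, and the rough centre point used for Bulgaria are all illustrative choices.

```python
import random


def jitter(lat, lon, max_offset=0.5, rng=None):
    """Return (lat, lon) moved by a uniform random offset of at most max_offset degrees."""
    rng = rng or random.Random()
    return (
        lat + rng.uniform(-max_offset, max_offset),
        lon + rng.uniform(-max_offset, max_offset),
    )


# A seeded generator keeps the marker positions stable between page loads.
rng = random.Random(42)
center = (42.7, 25.5)  # rough, assumed centre point for the country-level place
points = [jitter(center[0], center[1], max_offset=0.5, rng=rng) for _ in range(5)]
for p in points:
    print(p)
```

A uniform offset in degrees is the crudest option; scaling the offset to the size of the place, or to the zoom level, would avoid scattering markers outside small countries.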

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03



3.1 SEEING STANDARDS (2)

3.2 XML SCHEMAS
Do you deal with XML? I bet you do.

- XML Schema (XSD): most widely used, but most unwieldy
- RelaxNG (RNG): new-generation schema language
- RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced from it
- Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)

Tools:

- https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
- https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC; Emacs with code highlighting and syntax checking (flycheck)
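To illustrate what "cross-field validation" means, here is a minimal sketch in Python (standard library only). It is not Schematron itself, which is written in XML and run by a Schematron processor. The element and attribute names (`daterange`, `from`, `to`) and the rule are hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second range ends before it starts.
SAMPLE = """
<record>
  <daterange from="1941-06-01" to="1945-05-08"/>
  <daterange from="1950-01-01" to="1949-12-31"/>
</record>
"""

def check_date_ranges(xml_text):
    """Return error messages for date ranges whose start is after their end.

    A grammar (XSD/RNG/RNC) can state that both attributes are dates,
    but it cannot compare one field against another.
    """
    root = ET.fromstring(xml_text)
    errors = []
    for i, el in enumerate(root.iter("daterange"), start=1):
        start, end = el.get("from"), el.get("to")
        # ISO 8601 dates (YYYY-MM-DD) compare correctly as plain strings
        if start and end and start > end:
            errors.append(f"daterange[{i}]: from={start} is after to={end}")
    return errors

if __name__ == "__main__":
    for message in check_date_ranges(SAMPLE):
        print(message)
```

The error message carries the position of the offending element, in the spirit of the "XPath error location" reported in SVRL.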

3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).

3.3.1 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.

3.3.2 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.

3.3.3 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

3.3.4 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)

3.3.5 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements are inherited from CDWA.

3.4 ARCHIVE METADATA
- EAD: Encoded Archival Description. Describes archival materials (documentary units)
- EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
- EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.

- EAG: So-called "controlled access points" are text, and typically not controlled at all
- EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
- EAC: Related persons are names (strings), not links (things)
- EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
- EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
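To show what can and cannot be recovered from such a chronology, here is a minimal Python sketch (standard library only) that pulls out the date, any tagged person names, and the event text. The function name and output shape are illustrative assumptions, and the sample repeats the bioghist example with its punctuation restored. Persons are only found when the cataloguer tagged them with persname, which is exactly the weakness noted above.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""

def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, tagged persons, plain text."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # Only names explicitly tagged with <persname> are recoverable
            "persons": [p.get("normal") for p in event.iter("persname")],
            # Collapse whitespace left over from the XML layout
            "text": " ".join("".join(event.itertext()).split()),
        })
    return events

if __name__ == "__main__":
    for e in extract_events(BIOGHIST):
        print(e)
```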

3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002.

- marc-must-die.info: in-depth discussion, wiki
- FutureLib: Facebook group
- "MARC is dead (is it really?)": presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?

- RDF is a simple, abstracted data model
- Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID)
- Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
- The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
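One way to see the "nesting bias" point: the same facts, nested in one input and referenced by ID in another, lift to the same set of triples. The sketch below assumes a hypothetical dict-based record format with an `id` key; the `ex:` identifiers are made up, and real lifting would be done with a mapping language or an RDF library.

```python
def lift(record, subject, triples=None):
    """Flatten a (possibly nested) dict into (subject, predicate, object) triples."""
    if triples is None:
        triples = set()
    for key, value in record.items():
        if key == "id":
            continue
        for v in value if isinstance(value, list) else [value]:
            if isinstance(v, dict):
                # A nested node or a bare reference {"id": ...}: link to it,
                # then lift whatever properties it carries
                triples.add((subject, key, v["id"]))
                lift(v, v["id"], triples)
            else:
                triples.add((subject, key, v))
    return triples

def lift_all(records):
    """Lift a list of top-level records into one set of triples."""
    triples = set()
    for rec in records:
        lift(rec, rec["id"], triples)
    return triples

# Variant 1: the creator is nested inside the object record
NESTED = {
    "id": "ex:painting1",
    "dc:title": "Portrait",
    "dc:creator": {"id": "ex:hals", "foaf:name": "Frans Hals"},
}

# Variant 2: the creator is a separate record, referenced by ID
BY_REFERENCE = [
    {"id": "ex:painting1", "dc:title": "Portrait", "dc:creator": {"id": "ex:hals"}},
    {"id": "ex:hals", "foaf:name": "Frans Hals"},
]

if __name__ == "__main__":
    print(lift(NESTED, NESTED["id"]) == lift_all(BY_REFERENCE))
```

Both variants produce the same three triples, so the choice between nesting and referencing disappears once the data is lifted.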

4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

- OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
- Dublin Core: descriptive metadata
- SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
- CIDOC CRM: inspired the events and some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

- Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
- Complication: splits info about an object
  - EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  - EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
- Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
- Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
- Europeana Data Quality Committee: formed to push this strategic point (2015-2020)
- Evolving specification (since 2009)
  - Currently considering actual implementation of Events
  - Extensions for manuscripts, music, fashion, etc

4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.

Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
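The short-cut vs long-path construct can be sketched with plain tuples. The three property names come from the slide; the `ex:` identifiers, the measurement node and its `crm:P14_carried_out_by` detail are hypothetical illustration data. The function derives the short-cut triple from the long path, which is the direction in which no information has to be invented.

```python
P39I = "crm:P39i_was_measured_by"
P40 = "crm:P40_observed_dimension"
P43 = "crm:P43_has_dimension"

# Long path: the intermediate Measurement node can carry extra details
# (who measured, when), which the short-cut cannot express.
LONG_PATH = {
    ("ex:coin1", P39I, "ex:coin1/measurement1"),
    ("ex:coin1/measurement1", "rdf:type", "crm:E16_Measurement"),
    ("ex:coin1/measurement1", "crm:P14_carried_out_by", "ex:curator1"),
    ("ex:coin1/measurement1", P40, "ex:coin1/diameter"),
}

def derive_shortcuts(triples):
    """Infer 'thing P43 dimension' wherever 'thing P39i measurement P40 dimension' holds."""
    observed = {}
    for s, p, o in triples:
        if p == P40:
            observed.setdefault(s, set()).add(o)
    return {
        (thing, P43, dimension)
        for thing, p, measurement in triples
        if p == P39I
        for dimension in observed.get(measurement, ())
    }

if __name__ == "__main__":
    print(derive_shortcuts(LONG_PATH))
```

Going the other way (expanding a short-cut into a long path) would require minting a measurement node about which nothing is known, which is why the long path is the richer form.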

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources. E.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of the Complete Example from the spec (using my rdfpuml).
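As a sketch of the "Image and region over it" case, the function below builds an annotation as JSON-LD following the W3C Web Annotation data model (a textual body, a target image, and a media-fragment selector for the region). The function name and all `example.org` URLs are hypothetical.

```python
import json

def make_region_annotation(anno_id, comment, image_url, x, y, w, h):
    """Build a Web Annotation that attaches a text comment to a rectangular image region."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": image_url,
            "selector": {
                # Region expressed as a W3C Media Fragment: xywh=x,y,width,height (pixels)
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }

if __name__ == "__main__":
    anno = make_region_annotation(
        "http://example.org/anno/1",          # hypothetical annotation URL
        "Signature of the artist",
        "http://example.org/images/painting1.jpg",  # hypothetical image URL
        100, 200, 300, 50,
    )
    print(json.dumps(anno, indent=2))
```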

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
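An IIIF Image API request is a URL with a fixed sequence of path segments: identifier, region, size, rotation, then quality and format. Here is a minimal sketch of building such a URL; the server base URL and the identifier are hypothetical. The default size `full` follows Image API 2.x (version 3 uses `max` instead).

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    # The identifier is one path segment, so any "/" inside it must be percent-encoded
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"

if __name__ == "__main__":
    # Hypothetical server and identifier: a 300x400 pixel region,
    # scaled to fit within 600x600 pixels
    print(iiif_image_url("https://example.org/iiif", "ms1/f001",
                         region="100,200,300,400", size="!600,600"))
```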

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. the Rob Sanderson presentation "IIIF and JSON-LD".

4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:

- BIBO: used for a long time, pragmatic
- FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
- FRBRoo: based on CIDOC CRM, perhaps too complex
- FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
- BIBFRAME: sponsored by LoC, but soundly criticized for modeling mistakes
- RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
- SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA) Registry: info is well organized.

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. "appellee" for court decisions, "granting institution" for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

4.5.3 A TASTE OF FRBROO

The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).

EDM variant:

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant:

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant:

4.5.4 FRBR-INSPIRED

- "FRBR, Before and After" by K.Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations. Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
- "Mistakes have been made", K.Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props.

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf / rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search of related books on a Kiosk.

4.5.6.1 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170,00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

- Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
- Progress report (2015), Mlist for comments
- Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them
- Ontology: to follow after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
- Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc
- Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)

(Diagram based on work by M.Hildebrand)

5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama.

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals.

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this:

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL
- 2013-07: "Authority Addicts: The New Frontier of Authority Control on Wikidata", Wikimania 2013
- 2015-01: Wikidata Project Authority Control (initiated by Ontotext)
- 2015-03: "Name Data Sources for Semantic Enrichment", study for Europeana of datasets including Person/Organization names. Conclusions:
  - The best datasets to use for name enrichment are VIAF and Wikidata
  - There are few name forms in common between the "library-tradition" datasets (dominated by VIAF) and the "LOD-tradition" datasets (dominated by Wikidata)
  - VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
  - VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
  - Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
  - A lot can be gained by leveraging coreferencing across VIAF and Wikidata
  - Wikidata has great tools for crowd-sourced coreferencing
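The gain from coreferencing can be sketched as follows: once records in different datasets are known to describe the same person, their name forms can be pooled. This minimal sketch uses a union-find structure over coreference links. The record identifiers (`viaf:A`, `wikidata:B`, ...) are placeholders, not real IDs, and the name forms are a small illustrative sample.

```python
def merge_names(names_by_id, coreferences):
    """Pool name forms across records connected by coreference links.

    names_by_id: {record_id: set of name forms}
    coreferences: iterable of (record_id, record_id) pairs meaning "same entity"
    Returns a list of name-form sets, largest cluster first.
    """
    parent = {rid: rid for rid in names_by_id}

    def find(rid):
        # Follow parent pointers to the cluster representative (with path halving)
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]
            rid = parent[rid]
        return rid

    for a, b in coreferences:
        parent[find(a)] = find(b)

    clusters = {}
    for rid, names in names_by_id.items():
        clusters.setdefault(find(rid), set()).update(names)
    return sorted(clusters.values(), key=len, reverse=True)

# Placeholder record IDs; the name forms are for illustration only
NAMES = {
    "viaf:A": {"Cranach, Lucas", "Lucas Cranach der Ältere"},
    "wikidata:B": {"Lucas Cranach the Elder", "Lucas Cranach der Ältere"},
    "ulan:C": {"Cranach, Lucas, the elder"},
    "viaf:D": {"Spinoza, Benedictus de"},
}
LINKS = [("wikidata:B", "viaf:A"), ("ulan:C", "viaf:A")]

if __name__ == "__main__":
    for cluster in merge_names(NAMES, LINKS):
        print(sorted(cluster))
```

With the two links, the three Cranach records collapse into one cluster of four distinct name forms, while the unlinked record stays on its own.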

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists and works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:

- Publishing Vocabularies/Thesauri as LOD
- Publishing Museum collections and National Bibliographies as LOD
- Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
- Study of artistic influence over time and space
- Literary traditions, parallel editions
- Poetic repertories
- Studying manuscripts, stemmatology (manuscript derivation)
- Historiography
- Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments.

6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

- CollectionSpace: museum collection management
- ArchivesSpace: archive management
- ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
- ConservationSpace: line-of-business application for conservation specialists

6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).

Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals.

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach.

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D.Oldman).

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD
- GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
- As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM
- This mapping was followed by the Yale Center for British Art (YCBA)
- Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole), and everything in-between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

"Domain-specific modeling: Towards a Food and Drink Gazetteer", Tagarev A., Tolosi L., and Alexiev V., LNCS 9398, p.182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

"Using DBPedia in Europeana Food and Drink", Alexiev V., DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
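Jittering can be sketched in a few lines: each object that shares the same country-level coordinates gets a small random offset, so the markers spread out instead of stacking. This is only an illustration; the radius, the uniform distribution and the approximate center point are assumptions, not the values used in the EFD application.

```python
import random

def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return (lat, lon) moved by a random offset of at most radius_deg in each direction."""
    return (
        lat + rng.uniform(-radius_deg, radius_deg),
        lon + rng.uniform(-radius_deg, radius_deg),
    )

if __name__ == "__main__":
    # Approximate center of Bulgaria, used here only as an illustration
    center = (42.7, 25.5)
    rng = random.Random(2016)  # seeded, so the layout is reproducible between runs
    for _ in range(5):
        print(jitter(*center, rng=rng))
```

Passing a seeded `random.Random` keeps each marker in the same place across page loads, which avoids markers jumping around when the map is redrawn.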

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

- Europeana Wikimedia Taskforce report:
  - Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
  - Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
  - Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
- GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition

6.7 GETTY VOCABULARY PROGRAM LOD
- GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
- Center of the LODLAM cloud (Diagram by J.Cobb, 2014)
- GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

6.7.2 ONTOTEXT SCOPE OF WORK

- Semantic/ontology development: http://vocab.getty.edu/ontology
- Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
- Complete mapping specification
- Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
- Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
- GraphDB semantic repo, clustered for high-availability
- Semantic application development (customized Forest user interface) and tech consulting
- SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
- Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
- Sample queries (100), including charts, geographic queries, etc
- Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
- Help desk / support on twitter and google group (see home page)
- Presentations, papers: "On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI)", Alexiev V., Lindenthal J., and Isaac A., International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See "GVP LOD: Ontologies and Semantic Representation", V.Alexiev, CIDOC 2014.

External Ontologies:

Prefix  Ontology                               Used for
bibo    Bibliography Ontology                  Sources
dc      Dublin Core Elements                   common
dct     Dublin Core Terms                      common
foaf    Friend of a Friend ontology            Contributors
iso     ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                  Basic RDF representation
prov    Provenance Ontology                    Revision history
rdf     Resource Description Framework         Basic RDF representation
rdfs    RDF Schema                             Basic RDF representation
schema  Schema.org                             common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System   Basic vocabulary representation
skosxl  SKOS Extension for Labels              Rich labels
wgs     W3C World Geodetic Survey geo          Geo (TGN)
xsd     XML Schema Datatypes                   Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: a Custom sub-class, a Custom (sub-)prop, or an Ontology Value (e.g. <term/kind/Abbreviation>).

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
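What inverse pairs buy you can be sketched as a tiny inference step: for every stated relation, the inverse statement is added, and a symmetric relation is simply its own inverse. The property names below (`ex:teacher_of`, `ex:student_of`, `ex:distinguished_from`) are hypothetical stand-ins, not the actual GVP property URIs, and a real deployment would leave this to the repository's reasoner.

```python
# Hypothetical relation table: each property maps to its inverse;
# a symmetric property maps to itself.
INVERSE_OF = {
    "ex:teacher_of": "ex:student_of",
    "ex:distinguished_from": "ex:distinguished_from",
}

def with_inverses(triples, inverse_of):
    """Return the triples plus the inverse statement of every invertible triple."""
    # Make the lookup table work in both directions
    table = dict(inverse_of)
    table.update({inv: prop for prop, inv in inverse_of.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in table:
            result.add((o, table[p], s))
    return result

if __name__ == "__main__":
    stated = {
        ("ex:artist1", "ex:teacher_of", "ex:artist2"),
        ("ex:concept1", "ex:distinguished_from", "ex:concept2"),
    }
    for triple in sorted(with_inverses(stated, INVERSE_OF)):
        print(triple)
```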

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

"Getty Vocabularies: Linked Open Data. Semantic Representation", Alexiev V., Cobb J., Garcia G., Harpring P., Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in "Getty Vocabs: Why LOD? Why Now?", J.Cobb, 2014. E.g.:

AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

- PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
- Europeana uses AAT to enrich type/subject/material fields

6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

- In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
- In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

"Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure", V.Alexiev, L.Brazzo, CIDOC Congress 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
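The idea can be sketched as follows: split one flat record into a main Person plus derived Person records for the relatives named in it, connected by typed relations. The output shape, the derived identifiers and the single "likely inference" (the spouse's gender, flagged as inferred) are assumptions made for this illustration, not the EHRI implementation.

```python
RECORD = {
    "id": "123456",
    "firstName": "John",
    "lastName": "Smith",
    "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}

# Which record field yields which kind of relative
RELATIVE_FIELDS = (("nameSpouse", "spouse"), ("nameChild", "child"), ("nameSibling", "sibling"))

def derive_persons(rec):
    """Return (persons, relations) derived from one flat person record."""
    main = {
        "id": f"{rec['id']}/self",
        "name": f"{rec['firstName']} {rec['lastName']}",
        "gender": rec.get("gender"),
    }
    persons, relations = [main], []
    for field, relation in RELATIVE_FIELDS:
        name = rec.get(field)
        if not name:
            continue
        person = {"id": f"{rec['id']}/{relation}", "name": name}
        if relation == "spouse":
            if rec.get("nameSpouseMaiden"):
                person["maidenName"] = rec["nameSpouseMaiden"]
            # Likely inference only (assumption): spouse has the opposite gender
            if rec.get("gender") in ("Male", "Female"):
                person["gender"] = "Female" if rec["gender"] == "Male" else "Male"
                person["genderInferred"] = True
        persons.append(person)
        relations.append((main["id"], relation, person["id"]))
    return persons, relations

if __name__ == "__main__":
    persons, relations = derive_persons(RECORD)
    for p in persons:
        print(p)
    for r in relations:
        print(r)
```

Keeping the `genderInferred` flag lets a later matching step treat inferred values as weaker evidence than stated ones.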

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews:

- ONTO: Place enrichment, Person name recognition
- INRIA: word2vec experiments

guard      Cos dist    punishment   Cos dist
guarding   0.593507    punishments  0.668144
sentry     0.512083    punish       0.601212
hlinka     0.496201    punishing    0.543213
gate       0.490032    beatings     0.527033
watching   0.484647    penalty      0.497262
rifle      0.484379    deserved     0.490157
lookout    0.482025    beaten       0.473870
patrol     0.477233    straf        0.473338
soldier    0.475982    offense      0.461230
guarded    0.474689    executing    0.459965
police     0.474291    merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
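The two operations behind these results can be sketched with toy data: cosine similarity between word vectors (the scores in the table appear to be similarities, since the nearest words have the highest values), and "semantic differencing" as vector arithmetic followed by a nearest-neighbour lookup. The 3-dimensional vectors below are hand-made for this illustration and have nothing to do with the model trained on the interviews.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Toy vectors (assumption): dimensions loosely mean (Soviet, German, security organ)
TOY = {
    "KGB":    [1.0, 0.0, 1.0],
    "Stalin": [1.0, 0.0, 0.0],
    "Hitler": [0.0, 1.0, 0.0],
    "SS":     [0.0, 1.0, 1.0],
    "guard":  [0.3, 0.3, 0.8],
}

if __name__ == "__main__":
    print(analogy(TOY, "KGB", "Stalin", "Hitler"))
```

With real embeddings the target vector rarely coincides with a word vector, so the answer is whichever remaining word lies closest by cosine similarity.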

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHERS' PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper).

Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.

6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

- Numishare: Data platform for coins/medals, 100k coin types
- Nomisma: Shared authorities for numismatics
- Kerameikos: Pottery LOD
- EADitor: EAD Editor based on XML & XForms, uses/produces LOD
- xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

Page 25: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

32 XML SCHEMASDo you deal with XML I bet you do

XML Schema (XSD) most widely used but most unwieldyRelaxNG (RNG) new generation schema languageRNG Compact (RNC) non-XML notation most readable Eg EAD3 is mastered inRNC then RNG and XSD producedSchematron express rules in XPath that cant be captured in XSDRNGRNC (egcross- eld validation)

Tools

patch the jing RNG validator toemit errors like Schematron (SVRL with XPath error location)

RNC tools and CH schemas in RNC Emacswith code highlighting and syntax checking ( ycheck)

httpsgithubcomEHRIjing-trangtreeEHRI-176

httpsgithubcomVladimirAlexievrnc

33 MUSEUM METADATA CDWACategories for the Description of Works of Art (CDWA) realization of CCO 532categories (data elements)

331 CDWA LITE

XML schema implementing part of CDWA Moderate complexity about 300 elementsDisplay vs Indexing (structured) elements eg for Dimension

332 CONA SCHEMA

Cultural Objects Name Authority (CONA) Getty museum data aggregation Moderatecomplexity about 280 elements

333 SPECTRUM XML

has 10 entities and 592 elds of which 490 are Object(artwork) elds I am not aware of any systems producing thisSPECTRUM Schema 40b

334 LIDO

Lightweight Information Describing Objects (LIDO) Evolved from CDWA museumdatwith inspiration from CIDOC CRM (Images by RStein and AVitzthum ATHENAworkshop 2010)

335 LIDO SCHEMA

Complex schema eg when referring to a related object you can provide almost asmuch detail as for the main object Could leverage opportunities for linking moreDisplay vs Indexing (structured) elements inherited from CDWA

34 ARCHIVE METADATAEAD Encoded Archival Description Describes archival materials (documentary units)EACCPF Encoded Archival Context Corporations Persons FamiliesEAG Encoded Archival Guide Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation not enough to linking (dif cult to semanticize)Emphasis on documents not historic agents and events

EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)

bioghist

ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt

35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Fielding 2002

MARC is dead (is it really) in-depth discussion wiki

marc-must-dieinfoFutureLibFacebook group

by Sally Chambers ELAG 2011Presentation

4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering

RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info

41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on

OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object

EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs

Many providers use the minimal features and make mistakes Europeana didnt do alot of validation

Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana

formed to push this strategic point (2015-2020)Europeana Data Quality Committee

Evolving speci cation (since 2009)

Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc

4.2 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses); prop hierarchy (see - - - at bottom).

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.

Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
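The short-cut vs long-path construct can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain tuples, all `ex:` identifiers are hypothetical, and the function is not part of any CRM tooling. It shows that the short-cut can be derived from the long path, while the long path keeps room for extra detail (here, who carried out the measurement).

```python
# Triples are plain (subject, predicate, object) tuples.
# All ex: identifiers are hypothetical; only the crm: property names come from CIDOC CRM.
SHORTCUT = "crm:P43_has_dimension"
LONG_PATH = ("crm:P39i_was_measured_by", "crm:P40_observed_dimension")


def shortcuts_from_long_paths(triples):
    """Derive short-cut triples from two-step long paths."""
    first_hop, second_hop = LONG_PATH
    # Index the second hop: measurement -> dimensions it observed.
    observed = {}
    for s, p, o in triples:
        if p == second_hop:
            observed.setdefault(s, []).append(o)
    # Join the first hop against that index.
    derived = set()
    for s, p, o in triples:
        if p == first_hop:
            for dimension in observed.get(o, []):
                derived.add((s, SHORTCUT, dimension))
    return derived


graph = [
    ("ex:painting1", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height1"),
    # Detail that only the long path can carry: who did the measuring.
    ("ex:measurement1", "crm:P14_carried_out_by", "ex:conservator1"),
]

for triple in sorted(shortcuts_from_long_paths(graph)):
    print(triple)
```

The reverse direction is not possible: a short-cut triple alone does not say which measurement event produced the dimension.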

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)

W3C TR: mark, annotate, relate any web resources. E.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations on IIIF and JSON-LD.

4.5 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

- BIBO: used for a long time, pragmatic
- FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
- FRBRoo: based on CIDOC CRM, perhaps too complex
- FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
- BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
- RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
- SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA) Registry: info is well organized.

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

4.5.3 A TASTE OF FRBROO

EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

- "FRBR, Before and After" by K.Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations.
- Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
- "Mistakes have been made", K.Coyle, SWIB 2015.

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props.

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4.5.6.1 OSLO PUBLIC LIBRARY DATA

    d_res:tnr_749919
        rdf:type bibo:Document, fabio:Manifestation ;
        dc:title "About time" ;
        d:titleURLized "about_time" ;
        fabio:hasSubtitle "Einstein's unfinished revolution" ;
        ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time,
            d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
        foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
            <http://covers.openlibrary.org/b/id/96715-M.jpg>,
            <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
        owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
            <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
        dc:language lexvo:eng ;
        d:bibliofilID "931138" ;
        dc:format <http://data.deichman.no/format/Book> ;
        d:location_signature "Dav" ;
        dc:publisher d_org:penguin ;
        bibo:numPages "316" ;
        d:physicalDescription "fig." ;
        d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
        fabio:isManifestationOf d_work:x24918900_about_time ;
        d:signatureNote "07x0619gq" ;
        d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
        d:bsID "0181541" ;
        dc:description "Bibliografi: s. 293-294"@no ;
        d:priceInfo "Nkr 170.00" ;
        foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
            <http://www.librarything.com/work/23493> ;
        dc:identifier "749919" ;
        d:dewey "115", "530.11" ;
        d:location_dewey "530.11" ;
        bibo:isbn "9780140174618", "0140174613" .

4.6 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization, 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

- Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
- Progress report (2015). Mlist for comments.
- Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them.
- Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.

Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).

(Diagram based on work by M.Hildebrand)

5.1 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

- Works by painter across collections (catalogue raisonné). E.g. Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

5.2 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL

- 2013-07: "Authority Addicts: The New Frontier of Authority Control on Wikidata", Wikimania 2013
- 2015-01: Wikidata Project Authority Control (initiated by Ontotext)
- 2015-03: "Name Data Sources for Semantic Enrichment", study for Europeana of datasets including Person/Organization names. Conclusions:
  - The best datasets to use for name enrichment are VIAF and Wikidata.
  - There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
  - VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
  - Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
  - A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
  - Wikidata has great tools for crowd-sourced coreferencing.
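To make "leveraging coreferencing across VIAF and Wikidata" concrete, here is a minimal sketch of how pairwise coreference links can be merged into clusters. It assumes links are available as simple identifier pairs; the identifiers below are made up, and the union-find approach is my illustration, not the method used in the Europeana study. A Wikidata item linked only to GND still ends up in the same cluster as the VIAF record that GND is linked to.

```python
def coreference_clusters(links):
    """Group identifiers connected by coreference links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for node in parent:
        clusters.setdefault(find(node), set()).add(node)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Hypothetical identifiers, for illustration only.
links = [
    ("wd:A", "gnd:X"),     # Wikidata item coreferenced to a VIAF constituent
    ("gnd:X", "viaf:1"),   # that constituent is part of a VIAF cluster
    ("wd:B", "rkd:7"),     # coreferenced only to a non-constituent dataset
]
for cluster in coreference_clusters(links):
    print(cluster)
```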

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy.

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

- Publishing Vocabularies/Thesauri as LOD
- Publishing Museum collections and National Bibliographies as LOD
- Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
- Study of artistic influence over time and space
- Literary traditions, parallel editions
- Poetic repertories
- Studying manuscripts: stemmatology (manuscript derivation)
- Historiography
- Studying charters; prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAP:DRGN project)

Research functions, sometimes integrated into Virtual Research Environments.

6.1 MELLON "SPACE" PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

- CollectionSpace: museum collection management
- ArchivesSpace: archive management
- ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
- ConservationSpace: line of business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).

Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals.

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach.

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D.Oldman).

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

- GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
- As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
- This mapping was followed by the Yale Center for British Art (YCBA).
- Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE

Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

"Domain-specific modeling: Towards a Food and Drink Gazetteer". Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

"Using DBPedia in Europeana Food and Drink". Alexiev V. DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
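Jittering can be sketched as follows. This is not the EFD implementation, just one plausible way to do it: each marker is moved by a random offset of at most `max_km` around the original point. The centroid coordinates, the 50 km radius and the function name are assumptions chosen for the example.

```python
import math
import random


def jitter(lat, lon, max_km=50.0, rng=random):
    """Return (lat, lon) displaced by a random offset of at most max_km."""
    # sqrt() spreads points evenly over the disc instead of bunching at the centre.
    distance_km = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # Roughly 111 km per degree of latitude; longitude degrees shrink with cos(lat).
    dlat = distance_km * math.cos(bearing) / 111.0
    dlon = distance_km * math.sin(bearing) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


BULGARIA_CENTRE = (42.7, 25.5)  # approximate centroid, for illustration only
rng = random.Random(42)         # seeded so the map looks the same on every run
points = [jitter(*BULGARIA_CENTRE, max_km=50.0, rng=rng) for _ in range(5)]
for lat, lon in points:
    print(round(lat, 4), round(lon, 4))
```

Seeding the generator (or deriving the offset from the object id) keeps each flag in the same place between page loads, which is usually less confusing for users.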

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

- Europeana Wikimedia Taskforce report:
  - Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
  - Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy.
  - Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
- GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets ("swiecenie koszyczek" or just "Święconka" in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud (Diagram by J.Cobb, 2014). GVP Training Materials.

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

6.7.2 ONTOTEXT SCOPE OF WORK

- Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
- Complete mapping specification
- Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
- Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
- GraphDB semantic repo, clustered for high-availability
- Semantic application development (customized Forest user interface) and tech consulting
- SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
- Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
- Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
- Help desk support on twitter and google group (see home page)
- Presentations, papers:
  - "On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI)". Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See "GVP LOD: Ontologies and Semantic Representation", V.Alexiev, CIDOC 2014.

External Ontologies:

Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to:

- Custom sub-class
- Custom (sub-)prop
- Ontology Value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty: self-inverse).
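What inverse pairs buy you can be shown with a small sketch: state a relation once, and the other direction follows. The code below is an illustration only; the `ex:` relation names are hypothetical stand-ins (not the actual GVP property names), and in a real repository this job is done by the reasoner rather than by hand-written Python.

```python
def add_inverses(triples, inverse_pairs, symmetric=()):
    """Return the triples plus the inverse of every relation that has one."""
    inverse = {}
    for forward, backward in inverse_pairs:
        inverse[forward] = backward
        inverse[backward] = forward
    for prop in symmetric:
        inverse[prop] = prop  # a symmetric property is its own inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


# Hypothetical relation names standing in for owl:inverseOf pairs.
INVERSE_PAIRS = [("ex:teacher_of", "ex:student_of")]
SYMMETRIC = ["ex:colleague_of"]

stated = [
    ("ex:artistA", "ex:teacher_of", "ex:artistB"),
    ("ex:artistB", "ex:colleague_of", "ex:artistC"),
]
closed = add_inverses(stated, INVERSE_PAIRS, SYMMETRIC)
for triple in sorted(closed):
    print(triple)
```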

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

"Getty Vocabularies: Linked Open Data: Semantic Representation". Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in "Getty Vocabs: Why LOD? Why Now?", J.Cobb, 2014. E.g.:

- AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

- Europeana uses AAT to enrich type/subject/material fields.
- PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

6.8 J.P.GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:

6.8.1 J.P.GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

- In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database.
- In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

"Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure". V.Alexiev, L.Brazzo. CIDOC Congress, 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
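The idea can be sketched roughly like this. The field names come from the sample record; everything else (the stub structure, the naive name splitting, the "born after the marriage" rule) is an assumption made for illustration, not the EHRI implementation. Matching the stubs against other Person records is left out.

```python
# Field names follow the sample record; the inference rules are assumptions.
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def split_name(full_name):
    """Split 'Maria Smith' into ('Maria', 'Smith'); a naive heuristic."""
    first, _, last = full_name.rpartition(" ")
    return first, last


def related_persons(record):
    """Create one stub Person record per relative named in the record."""
    persons = []
    for field, relation in RELATION_FIELDS.items():
        if field not in record:
            continue
        first, last = split_name(record[field])
        person = {
            "firstName": first,
            "lastName": last,
            "relation": relation,
            "relatedTo": record["id"],
        }
        if relation == "spouse":
            # Facts the source record states about the spouse.
            if "nameSpouseMaiden" in record:
                person["maidenName"] = record["nameSpouseMaiden"]
            if "dateMarriage" in record:
                person["dateMarriage"] = record["dateMarriage"]
        if relation == "child" and "dateMarriage" in record:
            # Likely (not certain) inference: born after the parents' marriage.
            person["bornAfterLikely"] = record["dateMarriage"]
        persons.append(person)
    return persons


record = {
    "id": "123456",
    "firstName": "John",
    "lastName": "Smith",
    "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
stubs = related_persons(record)
for stub in stubs:
    print(stub)
```

Note that the sibling "Jack Jones" has a different last name than the record holder, which is exactly the kind of link that a name-only match would miss.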

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews:

- ONTO: Place enrichment, Person name recognition
- INRIA: word2vec experiments

guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
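For readers who have not seen this kind of vector arithmetic, here is a self-contained sketch of how "nearest word by cosine" and "A - B + C" queries work. The three-dimensional vectors are hand-made toy values chosen so the arithmetic works out; they are not taken from the INRIA model, and the function names are mine. In the table above, higher scores seem to mean closer words, i.e. the numbers behave like cosine similarities.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms


def nearest(target, vectors, exclude=()):
    """Word whose vector has the highest cosine similarity to target."""
    candidates = {w: vec for w, vec in vectors.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))


def analogy(a, b, c, vectors):
    """Answer 'a - b + c = ?' by vector arithmetic plus nearest-neighbour search."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(target, vectors, exclude={a, b, c})


# Toy vectors; axes: (organisation-ness, regime 1, regime 2). Not real embeddings.
TOY = {
    "KGB": [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS": [1.0, 0.0, 1.0],
    "guard": [0.5, 0.2, 0.2],
}
print(analogy("KGB", "Stalin", "Hitler", TOY))
```

The query words themselves are excluded from the search, otherwise the nearest neighbour of the result is often one of the inputs.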

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHERS' PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

- Art History networks from Wikipedia through VIAF id
- Time and nationality from ULAN

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

- Numishare: Data platform for coins/medals. 100k coin types
- Nomisma: Shared authorities for numismatics
- Kerameikos: Pottery LOD
- EADitor: EAD Editor based on XML & XForms; uses/produces LOD
- xEAC: EAC-CPF Editor based on XML & XForms; uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius... rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF; imports data as needed. Blog, Wiki.


3.3 MUSEUM METADATA: CDWA

Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).

3.3.1 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.

3.3.2 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.

3.3.3 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

3.3.4 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R.Stein and A.Vitzthum, ATHENA workshop, 2010)

3.3.5 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements: inherited from CDWA.

3.4 ARCHIVE METADATA

- EAD: Encoded Archival Description. Describes archival materials (documentary units)
- EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
- EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.

- EAG: So-called "controlled access points" are text and typically not controlled at all
- EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
- EAC: Related persons are names (strings), not links (things)
- EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
- EAC: Family tree modeled as Outline that's also used for other purposes (just presentation)

    <bioghist>
      <head>Chronological Events</head>
      <chronlist>
        <chronitem>
          <date normal="19781028">October 28, 1978</date>
          <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
        </chronitem>
        <chronitem>
          <date normal="19790315">March 15, 1979</date>
          <event>Departmental reorganization</event>
        </chronitem>
      </chronlist>
    </bioghist>
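To see how much (or how little) structure such a bioghist gives you, here is a short sketch that pulls out the events with Python's standard `xml.etree.ElementTree`. It assumes un-namespaced EAD as in the example; the function name and the output dictionary keys are my own. Only the date and the tagged person names come out as structured data; the rest of the event stays free text.

```python
import xml.etree.ElementTree as ET

# The EADiva-style example, restored to well-formed XML.
BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def chron_events(xml_text):
    """Extract date, flattened text and tagged person names per chronitem."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # Flatten mixed content and collapse whitespace.
            "text": " ".join("".join(event.itertext()).split()),
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


for entry in chron_events(BIOGHIST):
    print(entry)
```

The second event has no tagged persons at all, and nothing in the first says what "succeeds" means: the relation between the two people is only in the prose.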

3.5 LIBRARY METADATA: MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002.

- marc-must-die.info: MARC is dead (is it really?)
- FutureLib: in-depth discussion wiki
- Facebook group
- Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting", and back to some other format "lowering"?

- RDF is a simple abstracted data model
- Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID. Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
- The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info
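The "nesting bias" point can be illustrated with a toy lifting function. Below, the same facts are written in two XML shapes, one nesting the person inside the object and one referencing it by ID, and both lift to the same set of triples. The element and attribute names (`object`, `creator`, `person`, `id`, `ref`) are invented for this example and do not belong to any real schema.

```python
import xml.etree.ElementTree as ET

# Two hypothetical XML shapes for the same facts.
NESTED = """
<object id="obj1">
  <creator><person id="p1"><name>Frans Hals</name></person></creator>
</object>
"""
BY_REFERENCE = """
<data>
  <object id="obj1"><creator ref="p1"/></object>
  <person id="p1"><name>Frans Hals</name></person>
</data>
"""


def lift(xml_text):
    """Lift either XML shape into a set of (subject, predicate, object) triples."""
    triples = set()
    for elem in ET.fromstring(xml_text).iter():
        subject = elem.get("id")
        if subject is None:
            continue  # only elements with an id describe a resource
        for child in elem:
            if child.get("ref"):
                # Resource referenced by ID.
                triples.add((subject, child.tag, child.get("ref")))
            elif len(child) and child[0].get("id"):
                # Resource nested inside the property element.
                triples.add((subject, child.tag, child[0].get("id")))
            elif child.text and child.text.strip():
                # Plain literal value.
                triples.add((subject, child.tag, child.text.strip()))
    return triples


print(sorted(lift(NESTED)))
print(lift(NESTED) == lift(BY_REFERENCE))
```

Once the data is a set of triples, the choice the XML author made between nesting and referencing is simply gone, which is what makes merging data from different sources easier.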

41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on

OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object

EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs

Many providers use the minimal features and make mistakes Europeana didnt do alot of validation

Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana

formed to push this strategic point (2015-2020)Europeana Data Quality Committee

Evolving speci cation (since 2009)

Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc

42 CIDOC CRM comprehensive reference model used for history historic events

archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual

CIDOC CRM

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

(or including Kindle) (or including Kindle) essential to

understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails

Video Tutorial HTML versionGraphical Representation continuous HTML version

43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image

and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)

W3C TR

Complete Example

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio

441 IIIF PRESENTATION API

Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD

45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies

BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat

soundly criticized

httpbibschemaorg

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats

453 A TASTE OF FRBROO

Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le

TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists and works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters: prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

These provide research functions and are sometimes integrated into Virtual Research Environments.

6.1 MELLON "SPACE" PROJECTS

The Andrew W. Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. "Drawings by Rembrandt that are about Mammals".

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach.

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text = input, gray = intermediate, white = output).

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI-PMH

Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL.

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
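Jittering can be sketched as moving each marker by a small random offset around the place's coordinates. The snippet below is only an illustration, not the EFD implementation: the function name, the 30 km radius and the centroid used for Bulgaria are all assumptions.

```python
import math
import random

KM_PER_DEGREE_LAT = 111.32  # approximate length of one degree of latitude


def jitter(lat, lon, max_km=30.0, rng=None):
    """Return (lat, lon) displaced by a random offset of at most max_km kilometres."""
    rng = rng or random.Random()
    # The square root spreads points evenly over the disc instead of crowding the centre.
    distance_km = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    dlat = distance_km * math.cos(bearing) / KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = distance_km * math.sin(bearing) / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


# Hypothetical centroid for objects tagged only with the country.
BULGARIA = (42.7, 25.5)
rng = random.Random(42)  # fixed seed so the map looks the same on every render
markers = [jitter(*BULGARIA, max_km=30.0, rng=rng) for _ in range(5)]
for lat, lon in markers:
    print(f"{lat:.4f}, {lon:.4f}")
```

A fixed seed (or a seed derived from the object id) keeps each flag in the same spot between page loads, which is usually less confusing for users.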

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of the LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on Twitter and Google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies:

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: a custom sub-class, a custom (sub-)prop, or an ontology value (e.g. <term/kind/Abbreviation>).

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
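To illustrate what inverse and symmetric declarations buy you, here is a tiny sketch that materializes the implied triples over plain Python tuples. The relation names (ex:teacherOf, ex:studentOf, ex:siblingOf) are made up for the example and are not actual GVP relations. A real repository would derive these through its reasoner rather than application code.

```python
# Hypothetical relation declarations, standing in for owl:inverseOf / owl:SymmetricProperty.
INVERSES = {
    "ex:teacherOf": "ex:studentOf",
    "ex:studentOf": "ex:teacherOf",
}
SYMMETRIC = {"ex:siblingOf"}


def with_inverses(triples):
    """Return the input triples plus every triple implied by the declarations above."""
    result = set(triples)
    for subject, prop, obj in triples:
        if prop in INVERSES:
            result.add((obj, INVERSES[prop], subject))
        elif prop in SYMMETRIC:
            result.add((obj, prop, subject))
    return result


asserted = [
    ("ex:verrocchio", "ex:teacherOf", "ex:leonardo"),
    ("ex:anna", "ex:siblingOf", "ex:boris"),
]
for triple in sorted(with_inverses(asserted)):
    print(triple)
```

The point is that a cataloguer enters each relation once, and both directions become queryable.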

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like this.

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match them to other Person records in the database.
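The idea can be sketched as follows: expand one source record into several linked Person stubs. This is a minimal illustration, not EHRI code. The field names follow the sample record above, while the Person class, the expand function and the inference rule (spouse of a male is probably female) are assumptions. The matching step against other database records is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# How the main person relates back to each mentioned person.
INVERSE = {"spouse": "spouse", "child": "parent", "sibling": "sibling"}


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    maiden_name: Optional[str] = None
    # Each entry is (role of the other person, name of the other person).
    relations: List[Tuple[str, str]] = field(default_factory=list)


def expand(record):
    """Create Person stubs for the main person and everyone mentioned in the record."""
    main = Person(f"{record['firstName']} {record['lastName']}", record.get("gender"))
    people = [main]
    for key, role in (("nameSpouse", "spouse"), ("nameChild", "child"), ("nameSibling", "sibling")):
        name = record.get(key)
        if not name:
            continue
        other = Person(name)
        if role == "spouse":
            other.maiden_name = record.get("nameSpouseMaiden")
            if main.gender == "Male":
                other.gender = "Female"  # likely inference only, should be flagged as such
        main.relations.append((role, name))
        other.relations.append((INVERSE[role], main.name))
        people.append(other)
    return people


RECORD = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
people = expand(RECORD)
for person in people:
    print(person)
```

Each stub carries a back-link to the main person, so later matching can compare both the name and the family context.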

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews.

ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
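For readers new to word embeddings: neighbours like those above are typically ranked by the cosine between word vectors, and "semantic differencing" is plain vector arithmetic followed by a nearest-neighbour lookup. The sketch below uses tiny hand-made 3-dimensional vectors purely for illustration. They are not the INRIA vectors, and the real experiments would use vectors trained on the interview corpus.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms


def analogy(vectors, start, minus, plus):
    """Return the word whose vector is closest to start - minus + plus."""
    target = [s - m + p for s, m, p in zip(vectors[start], vectors[minus], vectors[plus])]
    # The three input words are excluded, as is customary for analogy queries.
    candidates = [w for w in vectors if w not in (start, minus, plus)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Made-up toy vectors; dimensions loosely mean [security organ, Soviet, German].
TOY = {
    "kgb": [1.0, 1.0, 0.0],
    "stalin": [0.0, 1.0, 0.0],
    "hitler": [0.0, 0.0, 1.0],
    "ss": [1.0, 0.0, 1.0],
    "guard": [0.8, 0.1, 0.1],
}
result = analogy(TOY, start="kgb", minus="stalin", plus="hitler")
print(result)
```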

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius... rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. (Blog, Wiki)


3.3.1 CDWA LITE

XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.

3.3.2 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.

3.3.3 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

3.3.4 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)

3.3.5 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA.

3.4 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporate Bodies, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.

EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's <bioghist> element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged, but often are not)
EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)

    <bioghist>
      <head>Chronological Events</head>
      <chronlist>
        <chronitem>
          <date normal="19781028">October 28, 1978</date>
          <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
        </chronitem>
        <chronitem>
          <date normal="19790315">March 15, 1979</date>
          <event>Departmental reorganization</event>
        </chronitem>
      </chronlist>
    </bioghist>

3.5 LIBRARY METADATA: MARC

MARC is 50 years old, unreadable, and doesn't accommodate the new FRBR principles. MARC-XML is not much better.

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant (2002):

marc-must-die.info
MARC is dead (is it really?): in-depth discussion
FutureLib wiki
Facebook group
Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting", and conversion back to some other format "lowering"?

RDF is a simple, abstracted data model
It doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). It has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info

4.1 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC CRM: inspired events and some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

Criticized that it's not expressive enough. E.g. it can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:
  EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use only the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009):
  Currently considering actual implementation of Events
  Extensions for manuscripts, music, fashion, etc.

4.2 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.

Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
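The short-cut vs long-path construct can be made concrete with a small sketch that rewrites each P43 short-cut triple into the corresponding measurement path. Triples are plain Python tuples here. The example URIs (ex:...) and the function name are hypothetical, and only the dimension path is handled. The intermediate crm:E16_Measurement node is where the extra details (who measured, when) could later be attached.

```python
from itertools import count

SHORTCUT = "crm:P43_has_dimension"


def expand_shortcut(triples, prefix="ex:measurement/"):
    """Replace each short-cut triple with the long path through a Measurement node."""
    ids = count(1)
    expanded = []
    for subject, prop, obj in triples:
        if prop != SHORTCUT:
            expanded.append((subject, prop, obj))
            continue
        measurement = f"{prefix}{next(ids)}"  # hypothetical URI for the new node
        expanded.extend([
            (subject, "crm:P39i_was_measured_by", measurement),
            (measurement, "rdf:type", "crm:E16_Measurement"),
            (measurement, "crm:P40_observed_dimension", obj),
        ])
    return expanded


data = [
    ("ex:painting/1", "crm:P2_has_type", "ex:type/painting"),
    ("ex:painting/1", SHORTCUT, "ex:painting/1/height"),
]
long_path = expand_shortcut(data)
for triple in long_path:
    print(triple)
```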

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)

W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary.

Diagram of Complete Example from spec (using my rdfpuml).
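As a small illustration of the "Image and region over it" case, the sketch below builds a Web Annotation as JSON-LD with a textual body and a media-fragment selector on the target image. The keys follow the W3C Web Annotation Data Model. The function name, the example.org URLs and the comment text are placeholders.

```python
import json


def region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation commenting on a rectangular region of an image."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                # Media Fragments syntax: xywh=left,top,width,height in pixels
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


annotation = region_annotation(
    "http://example.org/anno/1",
    "http://example.org/images/painting.jpg",
    (100, 150, 300, 200),
    "Signature of the artist",
)
print(json.dumps(annotation, indent=2))
```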

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
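To give a feel for the IIIF Image API: an image request is just a URL of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The helper below is a hedged sketch. The function name and the example.org server are made up, and the default size "full" follows Image API 2.x (version 3 uses "max" instead).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL from its path parameters."""
    # The identifier must be percent-encoded, including any slashes it contains.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Whole image at full size
full = iiif_image_url("https://example.org/iiif", "ms/folio 1")
# A 300x200 pixel detail starting at (100, 150), scaled to 50%
detail = iiif_image_url("https://example.org/iiif", "ms/folio 1",
                        region="100,150,300,200", size="pct:50")
print(full)
print(detail)
```

Because the region and size live in the URL, a deep-zoom viewer can fetch only the tiles it needs.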

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows you to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations on IIIF and JSON-LD.

4.5 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA) Registry: the info is well organized.

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

4.5.3 A TASTE OF FRBROO

The EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props.

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4.5.6.1 OSLO PUBLIC LIBRARY DATA

    d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation ;
      dc:title "About time" ;
      d:titleURLized "about_time" ;
      fabio:hasSubtitle "Einstein's unfinished revolution" ;
      ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time,
        d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
      foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
        <http://covers.openlibrary.org/b/id/96715-M.jpg>,
        <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
      owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
        <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
      dc:language lexvo:eng ;
      d:bibliofilID "931138" ;
      dc:format <http://data.deichman.no/format/Book> ;
      d:location_signature "Dav" ;
      dc:publisher d_org:penguin ;
      bibo:numPages "316" ;
      d:physicalDescription "fig." ;
      d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
      fabio:isManifestationOf d_work:x24918900_about_time ;
      d:signatureNote "07x0619gq" ;
      d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
      d:bsID "0181541" ;
      dc:description "Bibliografi: s. 293-294"@no ;
      d:priceInfo "Nkr 170.00" ;
      foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
        <http://www.librarything.com/work/23493> ;
      dc:identifier "749919" ;
      d:dewey "115", "530.11" ;
      d:location_dewey "530.11" ;
      bibo:isbn "9780140174618", "0140174613" .

4.6 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015)
Conceptual Model 1.0 (Sep 2016), Mlist for comments. Documents key components of archival description, properties of each, and relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.

External Ontologies

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
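A minimal sketch of what the inverse/symmetric declarations buy you: given a relation table, every asserted relation can be completed with its inverse. The relation names below (ex:teacherOf etc.) are hypothetical illustrations, not taken from the GVP ontology, and the plain-Python loop stands in for what a reasoner would do.

```python
# Hypothetical relation table: each relation maps to its inverse.
# A symmetric (self-inverse) relation simply maps to itself.
INVERSE_OF = {
    "ex:teacherOf": "ex:studentOf",
    "ex:studentOf": "ex:teacherOf",
    "ex:siblingOf": "ex:siblingOf",  # symmetric
}

def materialize_inverses(triples):
    """Return the input triples plus the inverse triple of each known relation."""
    result = set(triples)
    for s, p, o in triples:
        inverse = INVERSE_OF.get(p)
        if inverse is not None:
            result.add((o, inverse, s))
    return result

triples = {
    ("ex:A", "ex:teacherOf", "ex:B"),
    ("ex:C", "ex:siblingOf", "ex:D"),
}
closed = materialize_inverses(triples)
for triple in sorted(closed):
    print(triple)
```

Declaring the pairs once in a table (or spreadsheet) keeps the data entry one-directional while queries can follow the relation from either end.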

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

6.8 J. P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J. P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. (WD query shown as image grid.)

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015), it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
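A minimal sketch of the first two steps (creating Person records and making likely inferences), using the field names from the sample record. The Person class, the relation labels and the spouse-gender inference are illustrative assumptions, not the EHRI implementation; the final matching step is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    maiden_name: Optional[str] = None
    # (relation, name of the other person), e.g. ("spouse", "Maria Smith")
    relations: List[Tuple[str, str]] = field(default_factory=list)

# Record field -> relation of the mentioned person to the main person
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}
# How the mentioned person sees the main person
INVERSE = {"spouse": "spouse", "child": "parent", "sibling": "sibling"}
OPPOSITE_GENDER = {"Male": "Female", "Female": "Male"}

def expand_record(rec):
    """Create a Person for the record subject and for each person mentioned in it."""
    main = Person(name=f"{rec['firstName']} {rec['lastName']}", gender=rec.get("gender"))
    people = [main]
    for fld, relation in RELATION_FIELDS.items():
        if fld not in rec:
            continue
        other = Person(name=rec[fld])
        if relation == "spouse":
            # "Likely inference" (an assumption for records of this period):
            # the spouse has the opposite gender and carries the maiden name.
            other.gender = OPPOSITE_GENDER.get(main.gender)
            other.maiden_name = rec.get("nameSpouseMaiden")
        main.relations.append((relation, other.name))
        other.relations.append((INVERSE[relation], main.name))
        people.append(other)
    return people

rec = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
people = expand_record(rec)
for person in people:
    print(person)
```

Each derived record carries enough context (name, inferred gender, relation to a known person) to be compared against existing Person records.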

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH interviews.

ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
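A minimal sketch of how such nearest-neighbour and "differencing" queries work on word vectors. The 3-dimensional toy vectors below are invented for illustration only (real word2vec vectors have hundreds of dimensions and are learned from the interview text); only the arithmetic is the point.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def nearest(target, vectors, exclude=()):
    """Word whose vector has the highest cosine similarity to the target vector."""
    scores = {w: cosine(target, v) for w, v in vectors.items() if w not in exclude}
    return max(scores, key=scores.get)

# Toy vectors (assumption); dimensions loosely mean: organization, Soviet, German
VECTORS = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS":     [1.0, 0.0, 1.0],
    "guard":  [0.5, 0.2, 0.2],
}

# KGB - Stalin + Hitler, computed component by component
query = [k - s + h for k, s, h in zip(VECTORS["KGB"], VECTORS["Stalin"], VECTORS["Hitler"])]
answer = nearest(query, VECTORS, exclude={"KGB", "Stalin", "Hitler"})
print(answer)
```

The words used in the query are excluded from the candidates, which is the usual convention for analogy queries.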

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. See Blog, Wiki.


3.3.2 CONA SCHEMA

Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.

3.3.3 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

3.3.4 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010.)

3.3.5 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements: inherited from CDWA

3.4 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.

EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
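A minimal sketch of what can be recovered from such a bioghist fragment with the Python standard library: the normalized date, the tagged person names and the event text. The output structure (a list of dicts) is an assumption chosen for illustration. The second event shows the problem named above: nothing is tagged, so only free text comes out.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""

def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, tagged persons, plain event text."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            "persons": [p.get("normal") for p in event.iter("persname")],
            # Collapse the whitespace left over from the XML indentation
            "text": " ".join("".join(event.itertext()).split()),
        })
    return events

events = extract_events(BIOGHIST)
for e in events:
    print(e)
```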

3.5 LIBRARY METADATA: MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant (2002):

MARC is dead (is it really?)
marc-must-die.info
FutureLib: in-depth discussion wiki
Facebook group
Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting", and back to some other format "lowering"?

RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return its description and info

4.1 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM inspired: events, some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
  EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
  Currently considering actual implementation of Events
  Extensions for manuscripts, music, fashion, etc.

4.2 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.

Typical modeling construct: short-cut (crm:P43_has_dimension) vs long path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
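A minimal sketch of the short-cut vs long-path construct, with triples as plain Python tuples. The ex: resources are hypothetical, and the inference function is an illustration of the idea (derive the short-cut from the long path), not a rule from any particular CRM implementation. The extra triple on the measurement event shows the "more details" that only the long path can carry.

```python
P43 = "crm:P43_has_dimension"
P39I = "crm:P39i_was_measured_by"
P40 = "crm:P40_observed_dimension"

def infer_shortcuts(triples):
    """Add thing -P43-> dimension wherever thing -P39i-> measurement -P40-> dimension."""
    observed = {}
    for s, p, o in triples:
        if p == P40:
            observed.setdefault(s, set()).add(o)
    inferred = set(triples)
    for s, p, o in triples:
        if p == P39I:
            for dimension in observed.get(o, ()):
                inferred.add((s, P43, dimension))
    return inferred

# Hypothetical instance data: the measurement event can say who measured
long_path = {
    ("ex:painting1", P39I, "ex:measurement1"),
    ("ex:measurement1", P40, "ex:height1"),
    ("ex:measurement1", "crm:P14_carried_out_by", "ex:curator1"),
}
graph = infer_shortcuts(long_path)
for triple in sorted(graph):
    print(triple)
```

Going the other way (from short-cut to long path) is not possible without inventing the intermediate event, which is why detailed data is better recorded with the long path.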

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)

W3C TR: mark, annotate, relate any web resources. E.g. Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD.

4.5 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA) Registry: info is well organized.

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

4.5.3 A TASTE OF FRBROO

EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props.

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4.5.6.1 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

4.6 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Contexts (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015)
Conceptual Model 1.0 (Sep 2016), Mlist for comments: document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)

5.1 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

5.2 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment: study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
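A minimal sketch of why coreferencing across datasets compounds: if a Wikidata item is linked to a VIAF-constituent record (e.g. GND), and VIAF links to the same record, the two end up in one cluster even without a direct link. The identifiers below are invented placeholders, and the union-find clustering is one simple way to compute such clusters, not the method used in the study.

```python
def coreference_clusters(links):
    """Group identifiers into clusters, given pairwise 'same entity' links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())

# Placeholder identifiers (assumptions), not real records
links = [
    ("wikidata:X1", "gnd:G1"),   # Wikidata coreferenced to a VIAF constituent
    ("viaf:V1", "gnd:G1"),       # VIAF cluster containing the same GND record
    ("viaf:V2", "ulan:U2"),      # an unrelated cluster
]
clusters = coreference_clusters(links)
for cluster in clusters:
    print(sorted(cluster))
```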

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, and sometimes integrated into Virtual Research Environments.

6.1 MELLON "SPACE" PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search.

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals.

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach.

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
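A minimal sketch of jittering: each object tagged only with a country gets a small random offset from the country's center point, so that markers spread out instead of stacking. The center coordinates are approximate, and the uniform offset of up to half a degree is an assumption for illustration, not the setting used in the EFD app.

```python
import random

def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset in each direction."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )

BULGARIA_CENTER = (42.7, 25.5)   # approximate center, for illustration only
rng = random.Random(42)          # seeded so the layout is reproducible
points = [jitter(*BULGARIA_CENTER, rng=rng) for _ in range(9000)]
print(points[:3])
```

A seeded generator keeps marker positions stable between page loads; a larger country would warrant a larger offset.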


Page 29: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

3.3.3 SPECTRUM XML

SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.

3.3.4 LIDO

Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)

3.3.5 LIDO SCHEMA

Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more.
Display vs Indexing (structured) elements: inherited from CDWA.

3.4 ARCHIVE METADATA

EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to "semanticize")
Emphasis on documents, not historic agents and events

EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event>
        <persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head
      </event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
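To make the "strings, not things" problem concrete, here is a minimal Python sketch (standard library only) that pulls the events out of a bioghist fragment like the one above. The function name extract_events and the output layout are assumptions made for illustration. Note that the only machine-readable handles are the normal attributes, and those are still plain text rather than links.

```python
import xml.etree.ElementTree as ET

# The EAD fragment from the slide, as a well-formed XML string.
BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event>
        <persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head
      </event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""

def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, event text, tagged person names."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # Collapse whitespace in the mixed-content event text.
            "text": " ".join("".join(event.itertext()).split()),
            # Names are only available if the cataloguer tagged them.
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events

events = extract_events(BIOGHIST)
```

The second event has an empty persons list: nothing in the markup says who was involved, which is exactly the linking gap described above.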

3.5 LIBRARY METADATA: MARC

MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles.
MARC-XML is not much better.

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant (2002)

marc-must-die.info: MARC is dead (is it really?)
FutureLib: in-depth discussion wiki
Facebook group
Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting", and back to some other format "lowering"?

RDF is a simple, abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info

4.1 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM inspired: events, some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
  EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects (retro-converted from ESE) are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
  Currently considering actual implementation of Events
  Extensions for manuscripts, music, fashion, etc

4.2 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc; by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses); prop hierarchy (see - - - at bottom)

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
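The short-cut vs long-path construct can be sketched in a few lines of Python. This is only an illustration on plain (subject, property, object) tuples, not a CRM toolkit: the function name, the ex: identifiers and the blank-node labels are made up, while the property names are the ones quoted above and crm:E16_Measurement is the CRM class for the intermediate measurement event.

```python
def expand_dimension_shortcuts(triples):
    """Replace each P43 short-cut with the equivalent P39i / P40 long path."""
    expanded = []
    for index, (subject, prop, obj) in enumerate(triples, start=1):
        if prop == "crm:P43_has_dimension":
            # Hypothetical blank-node label for the intermediate Measurement.
            measurement = f"_:measurement{index}"
            expanded.append((subject, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, "crm:P40_observed_dimension", obj))
        else:
            expanded.append((subject, prop, obj))
    return expanded

# Hypothetical example data: one painting with one dimension.
shortcut = [("ex:painting1", "crm:P43_has_dimension", "ex:height1")]
long_path = expand_dimension_shortcuts(shortcut)
```

The intermediate Measurement node is where the extra details go, for example who measured the object and when.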

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)

W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of the Complete Example from the spec (using my rdfpuml)
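As a rough sketch of the "Image and region over it" case, the Python below builds one annotation as a dict and serializes it to JSON-LD. The keys follow the W3C Web Annotation JSON-LD serialization as I understand it; the function name and all example.org URLs are hypothetical.

```python
import json

def make_region_annotation(anno_id, image_url, x, y, w, h, comment):
    """Build a Web Annotation that attaches a text comment to a rectangular image region."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": image_url,
            # Media Fragments syntax selects the region: xywh=left,top,width,height
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }

annotation = make_region_annotation(
    "http://example.org/anno/1",          # hypothetical annotation IRI
    "http://example.org/images/page1.jpg",  # hypothetical image
    100, 50, 300, 200,
    "Signature of the artist",
)
annotation_json = json.dumps(annotation, indent=2)
```

The same body/target pattern covers the other pairs listed above (document and translation, paragraph and commentary); only the selector type changes.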

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers.
http://iiif.io
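The core of the IIIF Image API is a URL pattern: identifier, region, size, rotation, quality and format. A minimal Python sketch of building such a URL follows; the server base URL and identifier are hypothetical, and the defaults (full region, full size, no rotation, default quality) assume Image API 2.x conventions.

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an Image API request: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}"""
    # The identifier must be percent-encoded, including any slashes it contains.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Whole image at full size.
whole = iiif_image_url("https://example.org/iiif", "ms/folio1")

# A 1024x1024 pixel region from the top-left corner, scaled to 512 pixels wide.
detail = iiif_image_url("https://example.org/iiif", "ms/folio1",
                        region="0,0,1024,1024", size="512,")
```

Because the request is just a URL, any compliant viewer can ask any compliant server for exactly the tile it needs, which is what makes deep zoom interoperable.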

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows one to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD

4.5 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDARegistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
Schema BibEx (http://bib.schema.org): steps on a clean model, sponsored by the "big 4" search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA). The Registry info is well organized

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. "appellee" for court decisions, "granting institution" for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats

4.5.3 A TASTE OF FRBROO

The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf / rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk

4.5.6.1 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo#h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 17000";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

4.6 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good

E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Progress report (2015), Mlist for comments
Conceptual Model 1.0 (Sep 2016): Document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)

5.1 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo

5.2 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment: study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the "library-tradition" datasets (dominated by VIAF) and the "LOD-tradition" datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists and works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments

6.1 MELLON "SPACE" PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line-of-business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

Watch the video (D.Oldman). (Not Ontotext work)

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (Github). Uses GraphDB and the ElasticSearch enterprise connector

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
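Jittering can be as simple as adding a small random offset to each coordinate pair. The Python sketch below is not the EFD implementation: the function name, the half-degree default offset and the centroid coordinates used in the example are all assumptions for illustration.

```python
import random

def jitter(lat, lon, max_offset_deg=0.5, rng=None):
    """Return (lat, lon) shifted by a uniform random offset of at most max_offset_deg per axis."""
    rng = rng or random.Random()
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )

# Approximate centre of Bulgaria (assumed value), shared by many objects.
CENTER = (42.7, 25.5)

rng = random.Random(42)  # seeded so the layout is reproducible
markers = [jitter(CENTER[0], CENTER[1], 0.5, rng) for _ in range(5)]
```

A uniform square offset is the crudest choice; a real map would probably bound the offset by the shape of the country so that flags do not land across the border.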

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the "blessing of the baskets" (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
  On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014

External Ontologies

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty self-inverse)
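What an inverse/symmetric declaration buys you can be shown with a toy Python sketch: given one direction of a relation, the other direction is inferred. The ex: property and concept names below are invented for illustration and are not actual GVP relations; in the real dataset this inference is done by the semantic repository, not by application code.

```python
# Hypothetical declarations, mirroring owl:inverseOf pairs and owl:SymmetricProperty.
INVERSE_OF = {
    "ex:teacher_of": "ex:student_of",
    "ex:student_of": "ex:teacher_of",
}
SYMMETRIC = {"ex:related_to"}

def add_inverse_triples(triples):
    """Return the input triples plus every triple implied by the declarations above."""
    result = set(triples)
    for subject, prop, obj in triples:
        if prop in INVERSE_OF:
            result.add((obj, INVERSE_OF[prop], subject))
        elif prop in SYMMETRIC:
            result.add((obj, prop, subject))
    return result

asserted = [
    ("ex:artist_a", "ex:teacher_of", "ex:artist_b"),
    ("ex:concept_x", "ex:related_to", "ex:concept_y"),
]
inferred = add_inverse_triples(asserted)
```

Only one direction needs to be stored and maintained; queries can then start from either end of the relation.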

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). E.g.:

AAT used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after":

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo. CIDOC Congress 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
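The idea can be sketched in Python as follows. This is an illustration of the approach only, not EHRI code: the function names, the output fields and the specific inferences (spouse's maiden name, marriage date, likely gender) are assumptions, and real data would need far more careful name parsing.

```python
def split_name(full_name):
    """Naively split 'First Last' on the last space."""
    first, _, last = full_name.rpartition(" ")
    return first, last

def derive_person_records(rec):
    """Create one stub Person record per relative mentioned in a source record."""
    derived = []
    relations = (("nameSpouse", "spouse"), ("nameChild", "child"), ("nameSibling", "sibling"))
    for field, relation in relations:
        if field not in rec:
            continue
        first, last = split_name(rec[field])
        person = {
            "firstName": first,
            "lastName": last,
            "relation": relation,
            "sourceRecord": rec["id"],
        }
        if relation == "spouse":
            # Likely inferences only; each one should stay flagged as uncertain.
            if "nameSpouseMaiden" in rec:
                person["maidenName"] = rec["nameSpouseMaiden"]
            if "dateMarriage" in rec:
                person["dateMarriage"] = rec["dateMarriage"]
            if rec.get("gender") == "Male":
                person["gender"] = "Female"
        derived.append(person)
    return derived

record = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
relatives = derive_person_records(record)
```

Each derived stub can then be compared against existing Person records (same names, compatible dates) to find candidate matches and grow the network.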

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment   Cos dist
guarding   0.593507    punishments  0.668144
sentry     0.512083    punish       0.601212
hlinka     0.496201    punishing    0.543213
gate       0.490032    beatings     0.527033
watching   0.484647    penalty      0.497262
rifle      0.484379    deserved     0.490157
lookout    0.482025    beaten       0.473870
patrol     0.477233    straf        0.473338
soldier    0.475982    offense      0.461230
guarded    0.474689    executing    0.459965
police     0.474291    merciless    0.455123

semantic differencing (interesting): KGB - Stalin + Hitler = SS
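Both the neighbour lists and the "semantic differencing" come down to cosine similarity between word vectors. The Python sketch below shows the arithmetic on hand-made 4-dimensional toy vectors; the vectors and their axes are invented so that the analogy works out, whereas real word2vec vectors are learned from the interview text and have hundreds of dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy vectors; the axes are (leader, organization, Soviet, German). Invented values.
TOY_VECTORS = {
    "Stalin": [1.0, 0.0, 1.0, 0.0],
    "Hitler": [1.0, 0.0, 0.0, 1.0],
    "KGB":    [0.0, 1.0, 1.0, 0.0],
    "SS":     [0.0, 1.0, 0.0, 1.0],
    "police": [0.0, 1.0, 0.5, 0.5],
}

def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

result = analogy("KGB", "Stalin", "Hitler", TOY_VECTORS)
```

Subtracting Stalin removes the "Soviet" component from KGB and adding Hitler puts a "German" one in its place, so the nearest remaining word is the German counterpart organization.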

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames so we can get coordinates

6.11 OTHER PROJECTS: WIKI ART HISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius…, rendered in a chart using d3.js

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF: imports data as needed
Blog, Wiki

Page 30: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

334 LIDO

Lightweight Information Describing Objects (LIDO) Evolved from CDWA museumdatwith inspiration from CIDOC CRM (Images by RStein and AVitzthum ATHENAworkshop 2010)

335 LIDO SCHEMA

Complex schema eg when referring to a related object you can provide almost asmuch detail as for the main object Could leverage opportunities for linking moreDisplay vs Indexing (structured) elements inherited from CDWA

34 ARCHIVE METADATAEAD Encoded Archival Description Describes archival materials (documentary units)EACCPF Encoded Archival Context Corporations Persons FamiliesEAG Encoded Archival Guide Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation not enough to linking (dif cult to semanticize)Emphasis on documents not historic agents and events

EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)

bioghist

ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt

35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Fielding 2002

MARC is dead (is it really) in-depth discussion wiki

marc-must-dieinfoFutureLibFacebook group

by Sally Chambers ELAG 2011Presentation

4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering

RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info

41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on

OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object

EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs

Many providers use the minimal features and make mistakes Europeana didnt do alot of validation

Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana

formed to push this strategic point (2015-2020)Europeana Data Quality Committee

Evolving speci cation (since 2009)

Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc

42 CIDOC CRM comprehensive reference model used for history historic events

archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual

CIDOC CRM

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

(or including Kindle) (or including Kindle) essential to

understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails

Video Tutorial HTML versionGraphical Representation continuous HTML version

43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image

and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)

W3C TR

Complete Example

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio

441 IIIF PRESENTATION API

Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD

45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies

BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat

soundly criticized

httpbibschemaorg

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats

453 A TASTE OF FRBROO

Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le

TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR, Before and After by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K. Coyle, SWIB 2015.

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on a Kiosk.

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015), Conceptual Model 1.0 (Sep 2016), Mlist for comments
Conceptual Model: document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected:
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)

51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

515 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
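The gain from leveraging coreferencing across datasets can be sketched as a clustering problem: each pairwise "same entity" link joins two identifiers, and chaining the links yields clusters that are larger than what any single dataset asserts. The sketch below uses a small union-find; all identifiers are made up, and it is not a description of how VIAF or Wikidata actually build their clusters.

```python
def coreference_clusters(links):
    """Group identifiers into clusters from pairwise coreference links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for ident in list(parent):
        clusters.setdefault(find(ident), set()).add(ident)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical links: Wikidata item Q_A points to VIAF and RKD,
# while a GND record points to the same VIAF cluster.
links = [
    ("wikidata:Q_A", "viaf:V1"),
    ("wikidata:Q_A", "rkd:R1"),
    ("gnd:G1", "viaf:V1"),
    ("wikidata:Q_B", "ulan:U2"),
]
clusters = coreference_clusters(links)
print(clusters)
```

In this toy data the RKD identifier ends up coreferenced to the GND one even though no source states that link directly.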

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences: 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy.

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAP:DRGN project)

Research functions, sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists

62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. drawings by Rembrandt that are about Mammals.

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE
Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
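Jittering can be sketched in a few lines: every object that only has a country-level place gets the country's reference point plus a small random offset, so markers spread out instead of stacking. The centre coordinates, the radius and the uniform distribution below are assumptions for illustration, not the values used in the EFD application.

```python
import random


def jitter(lat, lon, n, radius_deg=0.5, seed=0):
    """Return n points scattered uniformly in a square around (lat, lon)."""
    rng = random.Random(seed)  # seeded so the map looks the same on every load
    return [
        (lat + rng.uniform(-radius_deg, radius_deg),
         lon + rng.uniform(-radius_deg, radius_deg))
        for _ in range(n)
    ]


# Approximate centre of Bulgaria (assumed), 9000 objects
points = jitter(42.7, 25.5, 9000)
print(points[:3])
```

A fixed seed is a deliberate choice here: without it, flags would jump to new positions each time the map is rendered.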

6610 GLAMS WORKING WITH WIKIDATA

GLAMs Working with Wikidata: Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

Easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.

External Ontologies:

Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key val can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
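What an inverse pair or a symmetric property buys you can be shown with a small sketch: given only one direction of each relation, the other direction is materialized automatically. The property and resource names below are invented for illustration and are not the actual GVP associative relation codes.

```python
def add_inverses(triples, inverse_of, symmetric):
    """Return triples plus the statements implied by inverse and symmetric properties."""
    # Make the inverse table work in both directions
    inverse = dict(inverse_of)
    inverse.update({v: k for k, v in inverse_of.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
        elif p in symmetric:
            result.add((o, p, s))  # a symmetric property is its own inverse
    return result


triples = {
    ("ex:personA", "ex:teacher_of", "ex:personB"),
    ("ex:personC", "ex:sibling_of", "ex:personD"),
    ("ex:personA", "ex:name", "A"),  # no inverse declared, left untouched
}
closed = add_inverses(
    triples,
    inverse_of={"ex:teacher_of": "ex:student_of"},
    symmetric={"ex:sibling_of"},
)
print(sorted(closed))
```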

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now?, J. Cobb, 2014. E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data (language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.)

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields. PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

68 JP GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:

681 JP GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
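The idea can be sketched as follows, using the sample record from the slide. The output structure, the generated identifiers and the specific inferences (attaching the maiden name and marriage date to the spouse, guessing the spouse's gender) are illustrative assumptions, not the EHRI implementation; matching the stubs against the database is left out.

```python
# Which name fields of a source record imply which relation to the main person
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


def derive_persons(rec):
    """Create a Person stub for the record's subject and for each person it mentions."""
    persons = [{
        "id": rec["id"],
        "name": f'{rec["firstName"]} {rec["lastName"]}',
        "gender": rec.get("gender"),
    }]
    for field, relation in RELATION_FIELDS.items():
        name = rec.get(field)
        if not name:
            continue
        stub = {"id": f'{rec["id"]}-{relation}', "name": name,
                "relation": relation, "relatedTo": rec["id"]}
        if relation == "spouse":
            # Likely inferences: these facts describe the spouse as well
            if rec.get("nameSpouseMaiden"):
                stub["maidenName"] = rec["nameSpouseMaiden"]
            if rec.get("dateMarriage"):
                stub["dateMarriage"] = rec["dateMarriage"]
            if rec.get("gender") == "Male":
                stub["gender"] = "Female"  # a guess, to be treated as low-confidence
        persons.append(stub)
    return persons


record = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
persons = derive_persons(record)
for person in persons:
    print(person)
```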

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123

semantic differencing (interesting): KGB - Stalin + Hitler = SS
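The two operations behind this slide, nearest neighbours by cosine and "semantic differencing" by vector arithmetic, can be sketched without any library. The three-dimensional vectors below are hand-made toy values chosen so the arithmetic is easy to follow; real word2vec vectors have hundreds of dimensions and are learned from the interview text.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c)."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))


# Toy dimensions (assumed): [organisation-ness, Soviet context, German context]
TOY_VECTORS = {
    "KGB": [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS": [1.0, 0.0, 1.0],
    "guard": [0.5, 0.2, 0.2],
}
result = analogy(TOY_VECTORS, "KGB", "Stalin", "Hitler")
print(result)
```

Subtracting one leader and adding the other moves the query from one context to the other while keeping the "organisation" component, which is why the nearest remaining word is the corresponding organisation.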

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN

612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals. 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki

Page 31: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

335 LIDO SCHEMA

Complex schema, e.g. when referring to a related object you can provide almost as much detail as for the main object. Could leverage opportunities for linking more.
Display vs Indexing (structured) elements: inherited from CDWA.

34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions

341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.

EAG: So-called "controlled access points" are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings) not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, that's also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
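A short sketch of what a consumer can and cannot get out of such a fragment, using only the Python standard library. The `extract_events` helper is hypothetical. It shows that the date is cleanly separable and that tagged person names can be pulled out, but what comes back are normalized name strings, not links to person records.

```python
import xml.etree.ElementTree as ET

EAD_SNIPPET = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def extract_events(xml_text):
    """Return date, flattened text and tagged person names for each chronitem."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            # Collapse mixed content (text plus persname children) into one string
            "text": " ".join("".join(event.itertext()).split()),
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


events = extract_events(EAD_SNIPPET)
for e in events:
    print(e)
```

The second event has no tagged persons at all, which is the common case the slide complains about.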

35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.

351 MARC MUST DIE

A whole emotional subculture, based on a slogan by Roy Tennant, 2002.

marc-must-die.info, FutureLib in-depth discussion wiki, Facebook group

MARC is dead (is it really?): Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?

RDF is a simple, abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID. Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info
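The "no nesting bias" point can be made concrete with a toy lifting function: two differently shaped source records, one nesting the creator and one referencing it by ID, lift to exactly the same set of triples. The record shapes, property names and `ex:` identifiers are made up for illustration.

```python
def lift(records):
    """Flatten records (possibly nested) into a set of (subject, property, value) triples."""
    triples = set()
    pending = list(records)
    while pending:
        node = pending.pop()
        for prop, value in node.items():
            if prop == "id":
                continue
            if isinstance(value, dict):
                # Nested sub-element: link to its id, then lift it like any other record
                triples.add((node["id"], prop, value["id"]))
                pending.append(value)
            else:
                triples.add((node["id"], prop, value))
    return triples


nested = [
    {"id": "ex:painting1", "title": "Portrait",
     "creator": {"id": "ex:artist1", "name": "Some Artist"}},
]
by_reference = [
    {"id": "ex:painting1", "title": "Portrait", "creator": "ex:artist1"},
    {"id": "ex:artist1", "name": "Some Artist"},
]
print(sorted(lift(nested)))
```

Once lifted, the choice the source format made between nesting and referencing is no longer visible, which is what makes data from different providers easier to merge.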

41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.

Europeana Data Quality Committee formed to push this strategic point (2015-2020).

Evolving specification (since 2009):

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.

42 CIDOC CRM
CIDOC CRM: comprehensive reference model, used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee).
Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.

Video Tutorial HTML versionGraphical Representation continuous HTML version

43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image

and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)

W3C TR

Complete Example

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio

441 IIIF PRESENTATION API

Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD

45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies

BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat

soundly criticized

httpbibschemaorg

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats

453 A TASTE OF FRBROO

Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le

TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. "Drawings by Rembrandt that are about Mammals"

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman)

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD
- GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
- As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
- This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

6.5 EUROPEANA LOD AND OAI-PMH
Ontotext created and hosted the Europeana SPARQL and OAI-PMH services

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL

6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (Github). Uses GraphDB and ElasticSearch enterprise connector

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
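The jittering idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the slides do not say how the offsets were computed, so the uniform offset, the `max_offset_deg` parameter and the centroid value below are all hypothetical.

```python
import random

def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Shift a point by a uniform random offset of at most max_offset_deg on each axis."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )

# Approximate centroid of Bulgaria (illustrative value, not taken from the slides).
BULGARIA = (42.7, 25.5)

rng = random.Random(2016)  # seeded so the sketch is repeatable
markers = [jitter(*BULGARIA, rng=rng) for _ in range(5)]
for lat, lon in markers:
    print(f"{lat:.4f}, {lon:.4f}")
```

Every object tagged only with the country then gets its own marker position near the centroid instead of stacking on one point.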

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:

- Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
- Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
- Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition

6.7 GETTY VOCABULARY PROGRAM LOD
- GVP: well-known and respected in GLAM
- Dependencies: AAT-TGN-ULAN-CONA
- Center of LODLAM cloud (Diagram by J. Cobb, 2014)
- GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust. AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

- Semantic/ontology development: http://vocab.getty.edu/ontology
- Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
- Complete mapping specification
- Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
- Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
- GraphDB semantic repo, clustered for high-availability
- Semantic application development (customized Forest user interface) and tech consulting
- SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
- Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
- Sample queries (100), including charts, geographic queries, etc.
- Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
- Help desk / support on twitter and google group (see home page)
- Presentations, papers

Alexiev V., Lindenthal J. and Isaac A. On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014

External Ontologies

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:

- Custom sub-class
- Custom (sub-)prop
- Ontology value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse)
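What the owl:inverseOf and owl:SymmetricProperty declarations buy you can be shown with a small sketch. It is not the GVP implementation (that relies on the repository's reasoning); the `ex:` property names and the in-memory triple representation are hypothetical.

```python
# Hypothetical associative relations; real GVP relation names differ.
INVERSE_OF = {"ex:teacherOf": "ex:studentOf"}   # owl:inverseOf pairs
SYMMETRIC = {"ex:relatedTo"}                    # owl:SymmetricProperty (self-inverse)

def materialize_inverses(triples):
    """Return the input triples plus the statements implied by inverse/symmetric props."""
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})  # inverseOf works both ways
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
        elif p in SYMMETRIC:
            result.add((o, p, s))
    return result

asserted = {
    ("ex:Verrocchio", "ex:teacherOf", "ex:Leonardo"),
    ("ex:A", "ex:relatedTo", "ex:B"),
}
inferred = materialize_inverses(asserted)
```

Declaring each relation once, with its inverse, means a cataloguer states a relation in one direction and queries work in both.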

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data (language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.)

6.7.12 AAT IN EUROPEANA

- Europeana uses AAT to enrich type/subject/material fields
- PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after"

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

- In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
- In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
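A minimal sketch of the first two steps (create Person records, make likely inferences) is given below. The field names come from the sample record; the output layout, the helper name `derive_persons` and the specific inference rules are assumptions for illustration, not the EHRI implementation.

```python
record = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

# Which "additional name" field implies which relation to the main person.
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}

def derive_persons(rec):
    """Create one stub Person record per related name found in rec."""
    persons = []
    for field, relation in RELATION_FIELDS.items():
        name = rec.get(field)
        if not name:
            continue
        first, _, last = name.rpartition(" ")  # naive split: last token is the surname
        person = {"firstName": first, "lastName": last,
                  "relation": relation, "relatedTo": rec["id"]}
        if relation == "spouse":
            if rec.get("nameSpouseMaiden"):
                person["maidenName"] = rec["nameSpouseMaiden"]
            # Likely (not certain) inference for a 1921 marriage record.
            if rec.get("gender") == "Male":
                person["gender"] = "Female"
            elif rec.get("gender") == "Female":
                person["gender"] = "Male"
        persons.append(person)
    return persons

persons = derive_persons(record)
```

The stub records can then be compared with existing Person records, for example on name plus maiden name; that matching step is not shown here.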

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
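Why matching to a gazetteer also deduplicates can be seen in a toy sketch: variant spellings that normalize to the same key end up under the same identifier. The gazetteer entries and `geo:EXAMPLE-…` identifiers are placeholders, and exact-key lookup is a simplifying assumption; the real pipeline is not described in the slides.

```python
import unicodedata
from collections import defaultdict

def normalize(name):
    """Lower-case, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())

# Placeholder gazetteer: normalized name -> identifier (not real Geonames IDs).
GAZETTEER = {"krakow": "geo:EXAMPLE-1", "lviv": "geo:EXAMPLE-2"}

def match_places(source_names, gazetteer):
    """Group source place names by matched identifier; collect the rest as unmatched."""
    matched = defaultdict(list)
    unmatched = []
    for name in source_names:
        place_id = gazetteer.get(normalize(name))
        if place_id is None:
            unmatched.append(name)
        else:
            matched[place_id].append(name)
    return dict(matched), unmatched

matched, unmatched = match_places(["Kraków", "KRAKOW", " Lviv ", "Atlantis"], GAZETTEER)
```

Here "Kraków" and "KRAKOW" collapse onto one identifier, which is the deduplication effect; unmatched names would go to a fuzzier or manual step.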

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard       Cos dist    punishment    Cos dist
guarding    0.593507    punishments   0.668144
sentry      0.512083    punish        0.601212
hlinka      0.496201    punishing     0.543213
gate        0.490032    beatings      0.527033
watching    0.484647    penalty       0.497262
rifle       0.484379    deserved      0.490157
lookout     0.482025    beaten        0.473870
patrol      0.477233    straf         0.473338
soldier     0.475982    offense       0.461230
guarded     0.474689    executing     0.459965
police      0.474291    merciless     0.455123

semantic differencing (interesting): KGB - Stalin + Hitler = SS
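The arithmetic behind such "semantic differencing" can be sketched with hand-made toy vectors. The four dimensions and all vector values below are invented so that the example works; they are not taken from the INRIA word2vec model, which uses learned vectors with hundreds of dimensions.

```python
import math

# Toy 4-d vectors: (security organ, Soviet, Nazi, leader). Values are made up.
VECTORS = {
    "KGB":    (1.0, 1.0, 0.0, 0.0),
    "Stalin": (0.0, 1.0, 0.0, 1.0),
    "Hitler": (0.0, 0.0, 1.0, 1.0),
    "SS":     (1.0, 0.0, 1.0, 0.0),
    "guard":  (1.0, 0.0, 0.0, 0.0),
    "police": (1.0, 0.2, 0.2, 0.0),
}

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def analogy(positive, negative, vectors):
    """Return the word whose vector is closest to sum(positive) - sum(negative)."""
    dims = len(next(iter(vectors.values())))
    target = [0.0] * dims
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    candidates = [w for w in vectors if w not in positive and w not in negative]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

result = analogy(positive=["KGB", "Hitler"], negative=["Stalin"], vectors=VECTORS)
print(result)
```

The query words themselves are excluded from the candidates, as is usual for analogy queries, since they would otherwise often be the nearest neighbours.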

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames so we can get coordinates

6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)

- Art History networks from Wikipedia through VIAF id
- Time and nationality from ULAN

6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD

- Numishare: Data platform for coins/medals; 100k coin types
- Nomisma: Shared authorities for numismatics
- Kerameikos: Pottery LOD
- EADitor: EAD Editor based on XML & XForms; uses/produces LOD
- xEAC: EAC-CPF Editor based on XML & XForms; uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War)

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6.13.7 KERAMEIKOS: POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki


3.4 ARCHIVE METADATA
- EAD: Encoded Archival Description. Describes archival materials (documentary units)
- EAC-CPF: Encoded Archival Context: Corporate Bodies, Persons, Families
- EAG: Encoded Archival Guide. Describes institutions

3.4.1 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events

- EAG: So-called "controlled access points" are text and typically not controlled at all
- EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's <bioghist> element (example below from EADiva)
- EAC: Related persons are names (strings), not links (things)
- EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
- EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

  <bioghist>
    <head>Chronological Events</head>
    <chronlist>
      <chronitem>
        <date normal="19781028">October 28, 1978</date>
        <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
      </chronitem>
      <chronitem>
        <date normal="19790315">March 15, 1979</date>
        <event>Departmental reorganization</event>
      </chronitem>
    </chronlist>
  </bioghist>
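To show what can and cannot be recovered from such a chronology, here is a small Python sketch using the standard library's ElementTree. The XML is the EADiva sample with its markup restored (no namespace, which is an assumption about the original); the function name and output layout are illustrative only.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""

def extract_events(xml_text):
    """Return one dict per <chronitem>: normalized date, event text, tagged person names."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            "text": " ".join("".join(event.itertext()).split()),
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events

events = extract_events(BIOGHIST)
```

The date is machine-readable and tagged names can be listed, but the persons are still strings rather than links, and the roles ("succeeds", "department head") remain buried in free text.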

3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better

3.5.1 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002

- marc-must-die.info
- FutureLib wiki
- Facebook group
- MARC is dead (is it really?): in-depth discussion
- Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?

- RDF is a simple, abstracted data model
- Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID)
- Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
- The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info

4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

- OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
- Dublin Core: descriptive metadata
- SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
- CIDOC-CRM inspired: events, some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

- Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
- Complication: splits info about an object
  - EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  - EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
- Many providers use only the minimal features and make mistakes. Europeana didn't do a lot of validation
- Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
- Europeana Data Quality Committee formed to push this strategic point (2015-2020)
- Evolving specification (since 2009)
  - Currently considering actual implementation of Events
  - Extensions for manuscripts, music, fashion, etc.

4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations

Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
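The short-cut vs long-path construct can be illustrated by expanding one into the other. The property names are those quoted above plus crm:E16_Measurement (the CRM class for measurement events); the triple-as-tuple representation, the `ex:` identifiers and the blank-node naming are assumptions made for this sketch.

```python
from itertools import count

def expand_dimension_shortcuts(triples):
    """Rewrite each P43 short-cut as the long path through an E16_Measurement node."""
    ids = count(1)
    expanded = []
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            measurement = f"_:measurement{next(ids)}"  # new intermediate node
            expanded.append((s, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, "crm:P40_observed_dimension", o))
        else:
            expanded.append((s, p, o))
    return expanded

short_cut = [
    ("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
    ("ex:painting1", "crm:P2_has_type", "ex:Painting"),
]
long_path = expand_dimension_shortcuts(short_cut)
```

The intermediate measurement node is what "allows more details": who measured, when, and with what technique can be attached to it, which the short-cut cannot express.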

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources. E.g. Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary

Diagram of Complete Example from spec (using my rdfpuml)
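As a rough illustration of the "image and region over it" case, the sketch below builds an annotation as JSON-LD in Python. The key names follow the W3C Web Annotation Data Model as I understand it (`body`, `target`, `FragmentSelector` with a media-fragment `xywh` value); the image URL, comment text and helper name are hypothetical.

```python
import json

def make_region_annotation(image_url, x, y, w, h, comment):
    """Build a Web Annotation that attaches a text comment to a rectangle of an image."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }

annotation = make_region_annotation(
    "https://example.org/images/painting.jpg",  # placeholder image URL
    10, 20, 300, 200,
    "Signature of the artist",
)
print(json.dumps(annotation, indent=2))
```

The same body/target pattern covers the other cases listed above: only the kind of target (page, paragraph, document) and the selector change.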

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
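For readers unfamiliar with the IIIF Image API, its core is a URL pattern: base, identifier, then region/size/rotation/quality.format. The sketch below assembles such a URL; the server base and identifier are placeholders, and the default `size="max"` is an assumption (older API versions use `full` instead).

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    encoded_id = quote(identifier, safe="")  # identifiers must be URL-encoded
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Hypothetical server and identifier: a 512x512 crop scaled to 256 pixels wide.
url = iiif_image_url("https://example.org/iiif", "ms 12/f1",
                     region="0,0,512,512", size="256,")
print(url)
```

Because every viewer requests tiles and crops through the same URL pattern, images can be zoomed without the client knowing anything about the server's storage.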

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD

4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:

- BIBO: used for a long time, pragmatic
- FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
- FRBRoo: based on CIDOC CRM, perhaps too complex
- Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
- BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
- RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
- SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA) Registry: info is well organized

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats

4.5.3 A TASTE OF FRBROO

The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

- FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
- Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
- Mistakes have been made, K. Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search for related books on a Kiosk

4.5.6.1 OSLO PUBLIC LIBRARY DATA

  d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
    dc:title "About time";
    d:titleURLized "about_time";
    fabio:hasSubtitle "Einstein's unfinished revolution";
    ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
    foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>,
      <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
    owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
    dc:language lexvo:eng;
    d:bibliofilID "931138";
    dc:format <http://data.deichman.no/format/Book>;
    d:location_signature "Dav";
    dc:publisher d_org:penguin;
    bibo:numPages "316";
    d:physicalDescription "fig.";
    d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
    fabio:isManifestationOf d_work:x24918900_about_time;
    d:signatureNote "07x0619gq";
    d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
    d:bsID "0181541";
    dc:description "Bibliografi: s. 293-294"@no;
    d:priceInfo "Nkr 170.00";
    foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>;
    dc:identifier "749919";
    d:dewey "115", "530.11";
    d:location_dewey "530.11";
    bibo:isbn "9780140174618", "0140174613".

4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

- Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
- Progress report (2015)
- Conceptual Model 1.0 (Sep 2016): Document key components of archival description, properties of each, relations between them. Mlist for comments
- Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
- Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
- Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
- (Diagram based on work by M. Hildebrand)

5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo

5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL
- 2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
- 2015-01: Wikidata Project Authority Control (initiated by Ontotext)
- 2015-03: Name Data Sources for Semantic Enrichment: study for Europeana of datasets including Person/Organization names. Conclusions:
  - The best datasets to use for name enrichment are VIAF and Wikidata
  - There are few name forms in common between the "library-tradition" datasets (dominated by VIAF) and the "LOD-tradition" datasets (dominated by Wikidata)
  - VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
  - VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
  - Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
  - A lot can be gained by leveraging coreferencing across VIAF and Wikidata
  - Wikidata has great tools for crowd-sourced coreferencing

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456
  firstName: "John", lastName: "Smith", gender: Male, dateMarriage: 1921-01-05
  additional names:
    nameSpouseMaiden: "Matienzo"
    nameSpouse: "Maria Smith"
    nameChild: "Mike Smith"
    nameSibling: "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
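A minimal sketch of this idea in Python, assuming the record is available as a dictionary with the field names shown above. The `Person` class, the `expand_record` helper, and the two inference rules (a maiden-name variant for the spouse, and the spouse's gender guessed from the main person's) are illustrative assumptions, not EHRI's actual pipeline.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    alt_names: List[str] = field(default_factory=list)
    relations: List[Tuple[str, str]] = field(default_factory=list)  # (relation, other name)


# Source field -> (relation seen from the main person, relation seen from the other person)
RELATION_FIELDS = {
    "nameSpouse": ("spouse", "spouse"),
    "nameChild": ("child", "parent"),
    "nameSibling": ("sibling", "sibling"),
}


def expand_record(rec: dict) -> List[Person]:
    """Create a Person for the record subject and for each person it mentions."""
    main = Person(f"{rec['firstName']} {rec['lastName']}", rec.get("gender"))
    people = [main]
    for field_name, (relation, inverse) in RELATION_FIELDS.items():
        other_name = rec.get(field_name)
        if not other_name:
            continue
        other = Person(other_name)
        main.relations.append((relation, other.name))
        other.relations.append((inverse, main.name))
        if relation == "spouse":
            maiden = rec.get("nameSpouseMaiden")
            if maiden:
                # Likely inference: the spouse was also known by her maiden name.
                other.alt_names.append(f"{other_name.split()[0]} {maiden}")
            if main.gender == "Male":
                # Likely (not certain) inference for a marriage recorded in 1921.
                other.gender = "Female"
        people.append(other)
    return people


RECORD = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
PEOPLE = expand_record(RECORD)
```

The alternative names and inferred attributes give the later matching step more to compare against other Person records.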

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH interviews:

ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard     Cos dist   punishment   Cos dist
guarding  0.593507   punishments  0.668144
sentry    0.512083   punish       0.601212
hlinka    0.496201   punishing    0.543213
gate      0.490032   beatings     0.527033
watching  0.484647   penalty      0.497262
rifle     0.484379   deserved     0.490157
lookout   0.482025   beaten       0.473870
patrol    0.477233   straf        0.473338
soldier   0.475982   offense      0.461230
guarded   0.474689   executing    0.459965
police    0.474291   merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
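The two operations behind the table and the "semantic differencing" line, nearest neighbours by cosine score and vector arithmetic, can be sketched as follows. The two-dimensional vectors are invented toy values chosen so the arithmetic works out; real word2vec vectors have hundreds of dimensions and come from a model trained on the interview text.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(vectors, query, exclude=(), top_n=3):
    """Words ranked by cosine score against the query vector."""
    scored = [
        (cosine_similarity(query, vec), word)
        for word, vec in vectors.items()
        if word not in exclude
    ]
    scored.sort(reverse=True)
    return [(word, score) for score, word in scored[:top_n]]


def analogy(vectors, base, minus, plus):
    """Best match for base - minus + plus, excluding the three input words."""
    query = [b - m + p for b, m, p in zip(vectors[base], vectors[minus], vectors[plus])]
    return nearest(vectors, query, exclude={base, minus, plus}, top_n=1)[0][0]


# Toy vectors (assumption): axis 0 ~ "organization", axis 1 ~ "regime".
TOY_VECTORS = {
    "KGB": [1.0, 1.0],
    "Stalin": [0.0, 1.0],
    "Hitler": [0.0, -1.0],
    "SS": [1.0, -1.0],
    "guard": [1.0, 0.0],
}

RESULT = analogy(TOY_VECTORS, "KGB", "Stalin", "Hitler")
```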

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.

612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics Eg a mint

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius, ... rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.


341 ARCHIVE METADATA PROBLEMS

Pay a lot of attention to presentation, not enough to linking (difficult to semanticize).
Emphasis on documents, not historic agents and events.

EAG: So-called "controlled access points" are text and typically not controlled at all.
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva).
EAC: Related persons are names (strings), not links (things).
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not).
EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation).

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
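To see what "semanticizing" such a chronology involves, here is a small Python sketch that pulls dates and tagged persons out of the bioghist example using the standard library's `xml.etree.ElementTree`. The `extract_events` helper and its output shape are illustrative assumptions; real EAD files are namespaced and far less regular.

```python
import xml.etree.ElementTree as ET

EAD_SNIPPET = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, event text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        # Only persons explicitly tagged with <persname> can be recovered reliably.
        persons = [p.get("normal") for p in event.iter("persname")]
        text = " ".join("".join(event.itertext()).split())
        events.append({"date": date.get("normal"), "text": text, "persons": persons})
    return events


EVENTS = extract_events(EAD_SNIPPET)
```

The second event shows the problem named above: with no tagged names, nothing but free text remains to link.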

35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002:

marc-must-die.info
FutureLib wiki
Facebook group
MARC is dead (is it really?): in-depth discussion. Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?

RDF is a simple, abstracted data model.
It doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID).
It has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience).
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info.

41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM inspired: events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized for not being expressive enough. E.g. it can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.

Europeana Data Quality Committee formed to push this strategic point (2015-2020).

Evolving specification (since 2009):

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.

42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
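The short-cut vs long-path construct can be illustrated with plain tuples standing in for RDF triples. The sketch below rewrites each short-cut statement into the long path through an intermediate measurement node; the blank-node naming scheme and the example object and dimension identifiers are assumptions made for illustration.

```python
def expand_dimension_shortcut(triples):
    """Rewrite crm:P43_has_dimension into the long path via a measurement node."""
    expanded = []
    counter = 0
    for subject, predicate, obj in triples:
        if predicate == "crm:P43_has_dimension":
            counter += 1
            measurement = f"_:measurement{counter}"  # hypothetical blank-node label
            expanded.append((subject, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, "crm:P40_observed_dimension", obj))
        else:
            expanded.append((subject, predicate, obj))
    return expanded


SHORT = [
    ("ex:painting1", "crm:P2_has_type", "ex:Painting"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:painting1_height"),
]
LONG = expand_dimension_shortcut(SHORT)
```

The extra measurement node is where the additional details go, for example who measured the object and when.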

43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers.
http://iiif.io

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD.

45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

453 A TASTE OF FRBROO

EDM–FRBRoo Application Profile Task Force asked: what to add to EDM to better fit FRBRoo?

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K. Coyle, SWIB 2015.

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and rdf2marc conversions (see also marc2rdf). Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4561 OSLO PUBLIC LIBRARY DATA
d_restnr_749919
  rdftype biboDocument fabioManifestation
  dctitle About time
  dtitleURLized about_time
  fabiohasSubtitle Einsteins unfinished revolution
  ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon
  foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt
  owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt
  dclanguage lexvoeng
  dbibliofilID 931138
  dcformat lthttpdatadeichmannoformatBookgt
  dlocation_signature Dav
  dcpublisher d_orgpenguin
  bibonumPages 316
  dphysicalDescription fig
  dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk
  fabioisManifestationOf d_workx24918900_about_time
  dsignatureNote 07x0619gq
  dbindingInfo lthttpdatadeichmannobindingInfohgt
  dbsID 0181541
  dcdescription Bibliografi s 293‐294no
  dpriceInfo Nkr 17000
  foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt
  dcidentifier 749919
  ddewey 115 53011
  dlocation_dewey 53011
  biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015)
Conceptual Model 10 (Sep 2016), Mlist for comments: documents key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected:
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)

51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info, like this.

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN, RKD artists/works, LoC Authorities, VIAF (not in M-n-M but on WD), BM persons, BBC YourPaintings, Artsy, etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies): "Prosopography is Greek for Facebook", SNAP:DRGN project, 2015

Research functions are sometimes integrated into Virtual Research Environments.

61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists

62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search.

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed the mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: you can't do this using the Europeana API, but it is easy with SPARQL.

66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi, x-axis is cat level), available content, NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
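Jittering itself is simple: add a small random offset to each marker so that objects sharing one coordinate spread out. The sketch below is a minimal illustration, assuming a uniform offset in degrees; the offset size and the approximate centre coordinates for Bulgaria are made-up values, not those used in the EFD application.

```python
import random


def jitter(lat, lon, max_offset_deg=0.3, rng=random):
    """Return the coordinate shifted by a random offset in each direction."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


# Approximate centre of Bulgaria (assumption for illustration only).
CENTER = (42.7, 25.5)

rng = random.Random(42)  # fixed seed so that repeated runs give the same layout
MARKERS = [jitter(CENTER[0], CENTER[1], 0.3, rng) for _ in range(100)]
```

A fixed seed (or an offset derived from the object id) keeps each marker in the same spot between page loads.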

6610 GLAMS WORKING WITH WIKIDATA

GLAMs Working with Wikidata: Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

E.g. easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of the LODLAM cloud (Diagram by J. Cobb 2014).
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See V. Alexiev, CIDOC 2014: GVP LOD: Ontologies and Semantic Representation.

External ontologies reused: bibo, dc, dct, foaf, iso, owl, prov, rdf, rdfs, schema, skos, skosxl, wgs, xsd.

Page 34: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better

351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Fielding 2002

MARC is dead (is it really) in-depth discussion wiki

marc-must-dieinfoFutureLibFacebook group

by Sally Chambers ELAG 2011Presentation

4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering

RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info

41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on

OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object

EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs

Many providers use the minimal features and make mistakes Europeana didnt do alot of validation

Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana

formed to push this strategic point (2015-2020)Europeana Data Quality Committee

Evolving speci cation (since 2009)

Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc

42 CIDOC CRM comprehensive reference model used for history historic events

archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual

CIDOC CRM

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

(or including Kindle) (or including Kindle) essential to

understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails

Video Tutorial HTML versionGraphical Representation continuous HTML version

43 WEB ANNOTATION (OPEN ANNOTATION, OA)

W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations on IIIF and JSON-LD

45 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats

453 A TASTE OF FRBROO

EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR, Before and After by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf / rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search for related books on a Kiosk

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

46 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA. Mlist for comments

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Progress report (2015)
Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)

515 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
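To make the gain from coreferencing concrete, here is a small Python sketch that merges the name forms of records linked by coreference pairs, using a union-find structure. The record identifiers and the function name `merge_name_forms` are hypothetical, and the name lists are illustrative rather than taken from the study.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def merge_name_forms(names_by_id: Dict[str, List[str]],
                     same_as: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group records connected by coreference links and pool their name forms."""
    parent = {record_id: record_id for record_id in names_by_id}

    def find(x: str) -> str:
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in same_as:
        parent[find(a)] = find(b)

    merged = defaultdict(set)
    for record_id, names in names_by_id.items():
        merged[find(record_id)].update(names)
    # Smallest clusters first, names alphabetical inside each cluster.
    return sorted((sorted(names) for names in merged.values()), key=len)


# Hypothetical identifiers; real VIAF/Wikidata/ULAN IDs are not reproduced here.
names_by_id = {
    "viaf:A": ["Cranach, Lucas"],
    "wikidata:A": ["Lucas Cranach the Elder", "Lucas Cranach der Ältere"],
    "ulan:A": ["Cranach, Lucas", "Lucas Cranach"],
    "viaf:B": ["Cranach, Hans"],
}
same_as = [("wikidata:A", "viaf:A"), ("wikidata:A", "ulan:A")]
clusters = merge_name_forms(names_by_id, same_as)
```

The library-style inverted forms and the multilingual forms end up in one cluster, which is the enrichment effect the conclusions describe.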

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy, etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook", SNAP:DRGN project, 2015

Research functions are sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search. E.g. Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

Watch the video (D. Oldman). (Not Ontotext work)

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

64 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
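A hedged sketch of what such a per-year count involves. The SPARQL text is an assumption about how newspaper records might be selected (the `dc:type "newspaper"` pattern in particular is hypothetical, not the query used for the chart); the Python function performs the same group-and-count over a few in-memory rows so the logic can be checked without an endpoint.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Assumed query shape only: how newspapers are typed and dated in the
# actual Europeana data is not stated in the slides.
SPARQL_SKETCH = """
PREFIX dc:  <http://purl.org/dc/elements/1.1/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
SELECT ?year (COUNT(?proxy) AS ?n) WHERE {
  ?proxy a ore:Proxy ;
         dc:type "newspaper" ;
         dc:date ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
}
GROUP BY ?year
ORDER BY ?year
"""


def count_by_year(rows: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count 'newspaper' rows per year; rows are (type, ISO date) pairs."""
    counts = Counter(date[:4] for kind, date in rows if kind == "newspaper")
    return dict(sorted(counts.items()))


# Made-up sample rows, for illustration only.
rows = [
    ("newspaper", "1914-08-01"),
    ("newspaper", "1914-08-02"),
    ("newspaper", "1918-11-11"),
    ("photograph", "1914-01-01"),
]
per_year = count_by_year(rows)
```

The aggregation happens server-side with SPARQL, whereas a record-oriented API would need every record to be paged through first.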

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole), and everything in between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
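Jittering can be sketched as adding a small random offset to each marker. The version below is an assumption about one reasonable way to do it (uniform sampling inside a disc, with a rough 111 km per degree of latitude); the centre coordinates, radius and function name are illustrative and do not come from the EFD code.

```python
import math
import random
from typing import List, Tuple

KM_PER_DEGREE_LAT = 111.0  # rough average, good enough for display purposes


def jitter(lat: float, lon: float, max_km: float,
           rng: random.Random) -> Tuple[float, float]:
    """Move a point by a random offset of at most max_km."""
    # sqrt() spreads points evenly over the disc instead of bunching at the centre.
    radius_km = max_km * math.sqrt(rng.random())
    angle = rng.uniform(0.0, 2.0 * math.pi)
    d_lat = radius_km * math.cos(angle) / KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    d_lon = radius_km * math.sin(angle) / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat + d_lat, lon + d_lon


# Approximate, assumed centre point for objects tagged only with the country.
CENTER_LAT, CENTER_LON = 42.7, 25.5
rng = random.Random(2016)  # fixed seed so the map looks the same on every run
points: List[Tuple[float, float]] = [
    jitter(CENTER_LAT, CENTER_LON, max_km=50.0, rng=rng) for _ in range(100)
]
```

A fixed seed keeps markers from jumping around between page loads; a real map would probably also clip the offsets to the country outline.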

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014

External Ontologies:

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse)
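What inverse and symmetric declarations buy you can be shown with a short Python sketch that materializes the reverse direction of each relation. The property names (`ex:producedBy`, `ex:produces`, `ex:relatedTo`) and the helper `add_inverses` are hypothetical stand-ins, not actual GVP property identifiers.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def add_inverses(triples: Iterable[Triple],
                 inverse_of: Dict[str, str],
                 symmetric: Iterable[str]) -> Set[Triple]:
    """Return the triples plus the reverse statement each declaration implies."""
    # owl:inverseOf works in both directions; a symmetric property is its own inverse.
    reverse = dict(inverse_of)
    reverse.update({b: a for a, b in inverse_of.items()})
    reverse.update({p: p for p in symmetric})

    result = set(triples)
    for s, p, o in list(result):
        if p in reverse:
            result.add((o, reverse[p], s))
    return result


# Hypothetical relations and concepts, for illustration only.
inverse_of = {"ex:producedBy": "ex:produces"}
symmetric = ["ex:relatedTo"]
asserted = [
    ("ex:ceramics", "ex:producedBy", "ex:potters"),
    ("ex:kilns", "ex:relatedTo", "ex:ceramics"),
]
closed = add_inverses(asserted, inverse_of, symmetric)
```

With the declarations in place, only one direction needs to be entered in the source data; the other follows by inference.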

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now?, J. Cobb, 2014. E.g.:

AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

68 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like the one shown

681 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after"

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
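The idea can be sketched in Python using the sample record above. The `Person` class, the field-to-role mapping, and the two "likely inferences" (a spouse's maiden-name form, and a spouse of the opposite gender in a 1921 marriage record) are assumptions made for illustration, not the EHRI implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Person:
    name: str
    gender: str = "unknown"
    alt_names: List[str] = field(default_factory=list)
    relations: List[Tuple[str, str]] = field(default_factory=list)  # (role, other name)


# Assumed mapping from record fields to relation roles.
ROLE_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


def expand_record(rec: Dict[str, str]) -> List[Person]:
    """Create a Person for the record subject and for every person it mentions."""
    main = Person(f"{rec['firstName']} {rec['lastName']}", rec.get("gender", "unknown"))
    people = [main]
    for field_name, role in ROLE_FIELDS.items():
        if field_name not in rec:
            continue
        other = Person(rec[field_name])
        other.relations.append((f"{role} of", main.name))
        main.relations.append((role, other.name))
        if role == "spouse":
            # Likely inferences (assumptions): opposite gender, plus a maiden-name variant.
            if main.gender in ("Male", "Female"):
                other.gender = "Female" if main.gender == "Male" else "Male"
            if "nameSpouseMaiden" in rec:
                given = other.name.split()[0]
                other.alt_names.append(f"{given} {rec['nameSpouseMaiden']}")
        people.append(other)
    return people


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
people = expand_record(record)
```

Each derived Person, with its alternative names and relations, then becomes a candidate for matching against other Person records.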

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard       Cos dist    punishment    Cos dist
guarding    0.593507    punishments   0.668144
sentry      0.512083    punish        0.601212
hlinka      0.496201    punishing     0.543213
gate        0.490032    beatings      0.527033
watching    0.484647    penalty       0.497262
rifle       0.484379    deserved      0.490157
lookout     0.482025    beaten        0.473870
patrol      0.477233    straf         0.473338
soldier     0.475982    offense       0.461230
guarded     0.474689    executing     0.459965
police      0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
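The arithmetic behind nearest-neighbour lists and "semantic differencing" can be sketched with plain Python. The three-dimensional vectors below are invented toy values chosen so the analogy works; they are not the INRIA word2vec vectors, and real embeddings have hundreds of dimensions.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity: 1.0 for same direction, 0.0 for orthogonal vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors: Dict[str, List[float]], a: str, b: str, c: str) -> str:
    """Return the word whose vector is closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors (assumed): dimensions read as (organization, Soviet, German).
toy_vectors = {
    "KGB": [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS": [1.0, 0.0, 1.0],
    "guard": [0.5, 0.2, 0.2],
}
answer = analogy(toy_vectors, "KGB", "Stalin", "Hitler")
```

Subtracting "Stalin" removes the Soviet component, adding "Hitler" adds the German one, and the nearest remaining vector is the organization on the German side.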

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki


351 MARC MUST DIE

A whole emotional subculture based on a slogan by Roy Tennant, 2002

marc-must-die.info
Facebook group: MARC is dead (is it really?)
FutureLib: in-depth discussion wiki
Presentation by Sally Chambers, ELAG 2011

4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting" and back to some other format "lowering"?

RDF is a simple abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID. Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info

41 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC CRM: inspired events, some relations between objects

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object

EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs

Many providers use the minimal features and make mistakes Europeana didnt do alot of validation

Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana

formed to push this strategic point (2015-2020)Europeana Data Quality Committee

Evolving speci cation (since 2009)

Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc

42 CIDOC CRM comprehensive reference model used for history historic events

archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual

CIDOC CRM

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

(or including Kindle) (or including Kindle) essential to

understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails

Video Tutorial HTML versionGraphical Representation continuous HTML version

43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image

and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)

W3C TR

Complete Example

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio

441 IIIF PRESENTATION API

Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD

45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies

BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat

soundly criticized

httpbibschemaorg

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats

453 A TASTE OF FRBROO

Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le

TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text = input, gray = intermediate, white = output).

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

- GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
- As part of RS, developed the mapping of BM data (2M objects) with BM, using CIDOC CRM
- This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), and Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI-PMH

Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. a chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is category level), available content, and NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia and Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.
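A hierarchical facet of this kind can be sketched as a roll-up of object counts along the place hierarchy. The snippet below is only an illustration under stated assumptions: the `PARENT` table is a hypothetical stand-in for Geonames parent links, and the place names and counts are invented, not taken from the EFD data.

```python
from collections import Counter

# Hypothetical child -> parent links, in the spirit of a Geonames place hierarchy
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Lyon": "France",
    "France": "Europe",
}


def ancestors_and_self(place):
    """Yield the place itself, then each of its ancestors up to the root."""
    while place is not None:
        yield place
        place = PARENT.get(place)


def facet_counts(object_places):
    """Count each object once for its own place and once for every ancestor."""
    counts = Counter()
    for place in object_places:
        counts.update(ancestors_and_self(place))
    return counts


# One entry per object: the place that object was enriched with
counts = facet_counts(["Sofia", "Sofia", "Plovdiv", "Lyon", "Bulgaria"])
for place in ("Europe", "Bulgaria", "Sofia", "France"):
    print(place, counts[place])
```

Selecting "Bulgaria" in such a facet would then also cover objects tagged only with Sofia or Plovdiv.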

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
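To give a feel for what marker clustering does, here is a generic grid-based sketch. It is not the algorithm of the library mentioned above (whose internals are not described here); the cell size and the sample coordinates are assumptions chosen for illustration.

```python
from collections import defaultdict


def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points by grid cell; return (mean_lat, mean_lon, count) per cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (int(lat // cell_deg), int(lon // cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        mean_lat = sum(p[0] for p in members) / n
        mean_lon = sum(p[1] for p in members) / n
        clusters.append((mean_lat, mean_lon, n))
    return clusters


# Illustrative points: two near Sofia, one near Plovdiv, one near Paris
points = [(42.70, 23.32), (42.65, 23.40), (42.14, 24.75), (48.85, 2.35)]
clusters = grid_clusters(points, cell_deg=1.0)
for lat, lon, n in clusters:
    print(f"{n} object(s) around {lat:.2f}, {lon:.2f}")
```

A map would then draw one marker per cluster, labelled with the count, and re-cluster with a smaller cell size as the user zooms in.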

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
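Jittering can be sketched as adding a small random offset to each marker's coordinates. The example below is a minimal illustration, assuming a uniform offset and an invented radius; the centre coordinates are approximate and the actual EFD implementation may differ.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return (lat, lon) shifted by a uniform random offset of at most radius_deg per axis."""
    return (
        lat + rng.uniform(-radius_deg, radius_deg),
        lon + rng.uniform(-radius_deg, radius_deg),
    )


# Approximate centre of Bulgaria (illustrative coordinates, not project data)
CENTER = (42.7, 25.5)
rng = random.Random(42)  # fixed seed so the sketch is repeatable
flags = [jitter(CENTER[0], CENTER[1], radius_deg=0.5, rng=rng) for _ in range(5)]
for lat, lon in flags:
    print(f"{lat:.3f}, {lon:.3f}")
```

The radius is a trade-off: too small and the flags still overlap, too large and they drift outside the country they are supposed to mark.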

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

- Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
- Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
- Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka, in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP is well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.

Center of LODLAM cloud (diagram by J. Cobb, 2014).

GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust. AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03.

6.7.2 ONTOTEXT SCOPE OF WORK

- Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
- Complete mapping specification
- Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
- Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
- GraphDB semantic repo, clustered for high availability
- Semantic application development (customized Forest user interface) and tech consulting
- SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
- Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
- Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
- Help desk/support on Twitter and Google group (see home page)
- Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See "GVP LOD: Ontologies and Semantic Representation", V. Alexiev, CIDOC 2014.

External Ontologies

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:

- a custom sub-class
- a custom (sub-)prop
- an ontology value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or are declared owl:SymmetricProperty, i.e. self-inverse).
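What inverse and symmetric declarations buy you can be sketched in a few lines: from each asserted relation, the reverse statement follows. This is an illustrative toy, not the GVP implementation (which relies on a semantic repository for inference); the `ex:` property and resource names are hypothetical.

```python
def with_inverses(triples, inverse_of, symmetric):
    """Return the input triples plus statements implied by inverse and symmetric props."""
    # owl:inverseOf works in both directions, so make the lookup table bidirectional
    inv = dict(inverse_of)
    inv.update({q: p for p, q in inverse_of.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in inv:
            result.add((o, inv[p], s))
        if p in symmetric:
            result.add((o, p, s))
    return result


# Hypothetical relation declarations and asserted data
INVERSE_OF = {"ex:teacherOf": "ex:studentOf"}
SYMMETRIC = {"ex:relatedTo"}
asserted = {
    ("ex:rembrandt", "ex:teacherOf", "ex:dou"),
    ("ex:conceptA", "ex:relatedTo", "ex:conceptB"),
}
closed = with_inverses(asserted, INVERSE_OF, SYMMETRIC)
for triple in sorted(closed):
    print(triple)
```

Only one direction needs to be stored or edited; the other is derived, which keeps the pairs consistent.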

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in "Getty Vocabs: Why LOD? Why Now?" (J. Cobb, 2014).

E.g. AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

- Europeana uses AAT to enrich type/subject/material fields
- PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels

6.8 J. P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J. P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

- In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
- In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
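The idea of spinning off Person records from the names mentioned in one record can be sketched as follows. This is a hypothetical illustration, not EHRI code: the `Person` class, the relation labels, and the "spouse of a male is likely female" inference are assumptions made for the example; the field names follow the sample record above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    # (relation, name of the related person)
    relations: List[Tuple[str, str]] = field(default_factory=list)


# Record field -> relation that the mentioned person has to the main person
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


def persons_from_record(rec):
    """Create a Person for the record's subject and for each relative it mentions."""
    main = Person(f"{rec['firstName']} {rec['lastName']}", rec.get("gender"))
    people = [main]
    for fld, relation in RELATION_FIELDS.items():
        if fld not in rec:
            continue
        other = Person(rec[fld])
        other.relations.append((relation + "Of", main.name))
        main.relations.append((relation, other.name))
        # A "likely inference" (assumption for this sketch, to be treated as uncertain)
        if relation == "spouse" and main.gender == "Male":
            other.gender = "Female"
        people.append(other)
    return people


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
people = persons_from_record(record)
for p in people:
    print(p)
```

Each derived Person can then be compared against existing records (by name, dates, relations) as a candidate match.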

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline for free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH interviews:

- ONTO: place enrichment, person name recognition
- INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
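The vector arithmetic behind such "semantic differencing" can be sketched with toy vectors: subtract one word vector, add another, and look for the nearest remaining word by cosine similarity. The 3-dimensional vectors below are invented for illustration (a neutral capital/country example); real word2vec models learn vectors with hundreds of dimensions from a corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors (assumed): dimensions are "France-ness", "Italy-ness", "capital-ness"
TOY = {
    "France": [1.0, 0.0, 0.0],
    "Paris":  [1.0, 0.0, 1.0],
    "Italy":  [0.0, 1.0, 0.0],
    "Rome":   [0.0, 1.0, 1.0],
    "Lyon":   [0.9, 0.0, 0.1],
}
answer = analogy(TOY, "Paris", "France", "Italy")
print("Paris - France + Italy =", answer)
```

The neighbour lists in the table above are the simpler case of the same operation: rank all words by cosine similarity to a single query vector.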

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

- Art History networks from Wikipedia, through VIAF id
- Time and nationality from ULAN

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

- Numishare: data platform for coins/medals, 100k coin types
- Nomisma: shared authorities for numismatics
- Kerameikos: pottery LOD
- EADitor: EAD editor based on XML & XForms, uses/produces LOD
- xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius, ... rendered in a chart using d3.js.

6.13.7 KERAMEIKOS: POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.


4 GLAM ONTOLOGIES

Why do they call conversion to RDF "lifting" and back to some other format "lowering"?

- RDF is a simple, abstracted data model
- It doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). It has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
- The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
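The "no nesting bias" point can be illustrated with a small sketch: whether a sub-element is nested or referenced by ID, lifting yields the same set of triples. The `lift` function and the `ex:` identifiers are hypothetical, and the sketch deliberately ignores the distinction between literals and resource references.

```python
def lift(node, triples=None):
    """Flatten a nested dict (with an 'id' key) into a set of (subject, property, object) triples."""
    if triples is None:
        triples = set()
    subject = node["id"]
    for prop, value in node.items():
        if prop == "id":
            continue
        if isinstance(value, dict):
            # Nested sub-element: link to it by its id, then lift it too
            triples.add((subject, prop, value["id"]))
            lift(value, triples)
        else:
            # Plain value, or a reference by ID
            triples.add((subject, prop, value))
    return triples


# The same information, once nested and once split into records linked by ID
nested = {"id": "ex:painting1", "creator": {"id": "ex:hals", "name": "Frans Hals"}}
by_ref_painting = {"id": "ex:painting1", "creator": "ex:hals"}
by_ref_artist = {"id": "ex:hals", "name": "Frans Hals"}

same = lift(nested) == lift(by_ref_painting) | lift(by_ref_artist)
print("Same triples regardless of nesting:", same)
```

"Lowering" would be the reverse step: choosing one particular nesting when serializing the triples back to XML or JSON.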

4.1 EUROPEANA DATA MODEL

Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:

- OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
- Dublin Core: descriptive metadata
- SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
- CIDOC-CRM: inspired events, some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

Criticized for not being expressive enough. E.g. it can't capture the specific contribution of an artist to an artwork.

Complication: it splits info about an object:

- EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
- EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.

Europeana Data Quality Committee: formed to push this strategic point (2015-2020).

Evolving specification (since 2009):

- Currently considering actual implementation of Events
- Extensions for manuscripts, music, fashion, etc.

4.2 CIDOC CRM

CIDOC CRM: a comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses); prop hierarchy (see - - - at bottom).

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.

Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
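The relation between a long path and its short-cut can be sketched as a simple derivation: wherever an object was measured by a measurement that observed a dimension, the short-cut statement follows. The triples and `ex:` identifiers below are invented for illustration; the three property names are the ones quoted above, plus crm:P4_has_time-span as an example of the extra detail a long path can carry.

```python
LONG_PATH = ("crm:P39i_was_measured_by", "crm:P40_observed_dimension")
SHORTCUT = "crm:P43_has_dimension"


def derive_shortcuts(triples):
    """For each object -P39i-> measurement -P40-> dimension, derive object -P43-> dimension."""
    objects_of = {}
    for s, p, o in triples:
        objects_of.setdefault((s, p), []).append(o)
    derived = set()
    for (s, p), measurements in objects_of.items():
        if p != LONG_PATH[0]:
            continue
        for measurement in measurements:
            for dimension in objects_of.get((measurement, LONG_PATH[1]), []):
                derived.add((s, SHORTCUT, dimension))
    return derived


# Hypothetical data: the measurement event can carry details the short-cut cannot
triples = [
    ("ex:painting", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height1"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:timespan1"),
]
derived = derive_shortcuts(triples)
print(derived)
```

Going the other way is not possible: the short-cut alone does not say who measured the object or when, which is why the long path "allows more details".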

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)

W3C TR: mark, annotate, relate any web resources, e.g. webpage and bookmark, image and region over it, document and translation, paragraph and commentary.

Diagram of the Complete Example from the spec (using my rdfpuml).
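As a rough illustration of the "image and region over it" case, the sketch below builds one annotation as a JSON-LD structure following the W3C Web Annotation Data Model (JSON-LD context, TextualBody, FragmentSelector with a media-fragment `xywh` value). The helper function, URLs, and comment text are hypothetical; this is not the spec's own Complete Example.

```python
import json


def image_region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation that attaches a text comment to a rectangular image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Hypothetical identifiers, for illustration only
anno = image_region_annotation(
    "http://example.org/anno/1",
    "http://example.org/images/drawing.jpg",
    (10, 20, 300, 150),
    "Signature of the artist in the lower left corner",
)
print(json.dumps(anno, indent=2))
```

The same body/target pattern covers the other cases listed above: only the target (and its selector) changes.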

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
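The core of the IIIF Image API is a URL pattern of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The helper below is a minimal sketch of composing such a URL; the server address and identifier are hypothetical, and the defaults follow Image API 2.x (where the full size is written "full"; version 3 uses "max").

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose an IIIF Image API request URL from its path segments."""
    # The identifier must be URL-encoded, including any slashes it contains
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier: a 300x400 detail, scaled to 150 pixels wide
url = iiif_image_url(
    "https://example.org/iiif",
    "folio 12r",
    region="100,200,300,400",
    size="150,",
)
print(url)
```

A deep-zoom viewer issues many such requests for tiles (small regions at a given size), which is why any compliant viewer can work with any compliant server.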

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to the JSON-LD representation (convenient for developers). Allows you to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD.

4.5 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

- BIBO: used for a long time, pragmatic
- FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
- FRBRoo: based on CIDOC CRM, perhaps too complex
- FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
- BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
- RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
- SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

The Resource Description and Access (RDA) Registry (RDAregistry.info) is well organized.

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

4.5.3 A TASTE OF FRBROO

The EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

- FRBR Before and After by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations
- Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
- Mistakes have been made, K. Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds its own props.

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds its own props. Enables a number of agile apps, e.g. search for related books on a kiosk.

4.5.6.1 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

4.6 ARCHIVAL ONTOLOGIES

There have been 3 attempts to represent EAD as RDF, but IMHO none of them is very good.

E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

- Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
- Progress report (2015)
- Conceptual Model 1.0 (Sep 2016); Mlist for comments. Documents key components of archival description, properties of each, and relations between them
- Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.

Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).

(Diagram based on work by M. Hildebrand)

5.1 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for works by painter across collections (catalogue raisonné), e.g. Frans Hals.

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, and the original website. E.g. Frans Hals on Crotos.

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, then add the info like this.

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

5.2 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:

- The best datasets to use for name enrichment are VIAF and Wikidata
- There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
- VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
- VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
- Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
- A lot can be gained by leveraging coreferencing across VIAF and Wikidata
- Wikidata has great tools for crowd-sourced coreferencing

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
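A rough sketch of why matching to a gazetteer also deduplicates: once several source spellings resolve to the same gazetteer id, they collapse into one place. This is an illustration under simple assumptions (exact match after normalization, a tiny in-memory gazetteer with made-up ids), not the EHRI pipeline.

```python
import unicodedata
from collections import defaultdict


def normalize(name):
    """Casefold, strip combining accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.casefold().split())


def match_places(source_names, gazetteer):
    """Map source names to gazetteer ids; return (matches, unmatched).

    gazetteer: dict of id -> list of known names (hypothetical sample data).
    """
    index = defaultdict(set)
    for place_id, names in gazetteer.items():
        for name in names:
            index[normalize(name)].add(place_id)
    matches, unmatched = defaultdict(list), []
    for name in source_names:
        ids = index.get(normalize(name), set())
        if len(ids) == 1:
            matches[next(iter(ids))].append(name)
        else:
            unmatched.append(name)      # no candidate, or ambiguous
    return dict(matches), unmatched


# Made-up ids; real Geonames ids would be used in practice.
GAZETTEER = {
    "geo:1": ["Lodz", "Litzmannstadt"],
    "geo:2": ["Kraków", "Cracow"],
}
matches, unmatched = match_places(
    ["LODZ", "Litzmannstadt", "Krakow ", "Kraków", "Atlantis"], GAZETTEER)
```

Ambiguous names are deliberately left unmatched here; a real pipeline would disambiguate them using context such as country or parent place.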

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment   Cos dist
guarding   0.593507    punishments  0.668144
sentry     0.512083    punish       0.601212
hlinka     0.496201    punishing    0.543213
gate       0.490032    beatings     0.527033
watching   0.484647    penalty      0.497262
rifle      0.484379    deserved     0.490157
lookout    0.482025    beaten       0.473870
patrol     0.477233    straf        0.473338
soldier    0.475982    offense      0.461230
guarded    0.474689    executing    0.459965
police     0.474291    merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
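To make the table and the "semantic differencing" line concrete, here is a small sketch of the two operations involved: cosine similarity between word vectors, and vector arithmetic followed by a nearest-neighbour lookup. The 2-dimensional TOY vectors are invented so that the arithmetic works out; real word2vec vectors have hundreds of dimensions and come from training on the interview corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (higher means closer)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(vectors, query, exclude=()):
    """Return the word whose vector is most similar to the query vector."""
    scored = [(cosine(vec, query), word)
              for word, vec in vectors.items() if word not in exclude]
    return max(scored)[1]


def analogy(vectors, a, b, c):
    """Word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(vectors, query, exclude={a, b, c})


# Invented toy vectors, for illustration only.
TOY = {
    "KGB": [1.0, 1.0],
    "Stalin": [0.0, 1.0],
    "Hitler": [0.0, -1.0],
    "SS": [1.0, -1.0],
    "guard": [0.5, 0.1],
}
result = analogy(TOY, "KGB", "Stalin", "Hitler")
```

With these toy vectors the query lands exactly on the vector for "SS"; with trained embeddings the result is only the nearest neighbour, so such analogies are suggestive rather than exact.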

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.

6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.


4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:

OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects

4.1.1 EDM SEMANTIC GRAPH

4.1.2 EDM ISSUES/CONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.

Europeana Data Quality Committee formed to push this strategic point (2015-2020).

Evolving specification (since 2009):

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
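A minimal sketch of the provider-side (External) split described above, with the kind of structural check that catches common provider mistakes. Triples are plain Python tuples with prefixed names; the ex: URIs are placeholders, and the two checks are illustrative assumptions, not Europeana's actual validation rules.

```python
def check_external_edm(triples):
    """Return a list of structural problems in provider-side EDM triples."""
    types = {(s, o) for s, p, o in triples if p == "rdf:type"}
    chos = {s for s, o in types if o == "edm:ProvidedCHO"}
    aggregations = {s for s, o in types if o == "ore:Aggregation"}
    problems = []
    for agg in sorted(aggregations):
        targets = {o for s, p, o in triples
                   if s == agg and p == "edm:aggregatedCHO"}
        if not targets:
            problems.append(f"{agg}: missing edm:aggregatedCHO")
        elif not targets <= chos:
            problems.append(f"{agg}: aggregates something that is not an edm:ProvidedCHO")
    aggregated = {o for s, p, o in triples if p == "edm:aggregatedCHO"}
    for cho in sorted(chos - aggregated):
        problems.append(f"{cho}: not aggregated by any ore:Aggregation")
    return problems


# Placeholder data: one object, its aggregation and a digital representation.
GOOD = [
    ("ex:cho1", "rdf:type", "edm:ProvidedCHO"),
    ("ex:cho1", "dc:title", "Portrait of a Man"),
    ("ex:agg1", "rdf:type", "ore:Aggregation"),
    ("ex:agg1", "edm:aggregatedCHO", "ex:cho1"),
    ("ex:agg1", "edm:isShownBy", "ex:image1.jpg"),
]
BAD = [t for t in GOOD if t[1] != "edm:aggregatedCHO"]
```

Descriptive metadata hangs off the edm:ProvidedCHO, while links to digital representations hang off the ore:Aggregation; the check only verifies that the two halves are connected.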

4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

4.2.1 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
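The short-cut vs long-path construct can be sketched as a rewrite over triples: each crm:P43_has_dimension statement is expanded into an intermediate crm:E16_Measurement node, which is where extra details (who measured, when) could then be attached. Triples are plain tuples; the ex: URIs and the measurement URI pattern are assumptions for illustration.

```python
def expand_dimension_shortcut(triples):
    """Replace each P43 short-cut with the equivalent long path via a Measurement."""
    expanded = []
    counter = 0
    for s, p, o in triples:
        if p != "crm:P43_has_dimension":
            expanded.append((s, p, o))
            continue
        counter += 1
        measurement = f"{s}/measurement/{counter}"   # hypothetical URI pattern
        expanded += [
            (s, "crm:P39i_was_measured_by", measurement),
            (measurement, "rdf:type", "crm:E16_Measurement"),
            (measurement, "crm:P40_observed_dimension", o),
        ]
    return expanded


short = [
    ("ex:painting1", "crm:P2_has_type", "ex:Painting"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:painting1/height"),
]
long_path = expand_dimension_shortcut(short)
```

The short-cut is more compact and easier to query; the long path costs two extra triples per dimension but gives a node to hang provenance on.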

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
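As a small illustration of the "Image and region over it" case, the sketch below builds a Web Annotation as a JSON-LD-shaped Python dict: a textual body, and a target that selects a rectangular region of an image with a media-fragment selector. The annotation id, image URL and comment are placeholders; this is a minimal annotation, far smaller than the spec's Complete Example.

```python
import json


def region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation commenting on a rectangular image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


anno = region_annotation(
    "http://example.org/anno/1",            # placeholder id
    "http://example.org/images/coin.jpg",   # placeholder image
    (120, 80, 200, 150),
    "Mint mark visible here",
)
serialized = json.dumps(anno, indent=2)
```

The same body/target structure covers the other cases listed above (bookmark, translation, commentary); only the body type and the selector change.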

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
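What "standard API" means in practice: a IIIF Image API request is just a URL with a fixed sequence of path segments, {identifier}/{region}/{size}/{rotation}/{quality}.{format}. The sketch below assembles such a URL; the server base URL and identifier are placeholders, and the "full" size keyword follows Image API 2.x (later versions use "max" instead).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble a IIIF Image API request URL (2.x-style parameter values)."""
    # The identifier must be percent-encoded, including any slashes it contains.
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Placeholder server and identifier.
whole = iiif_image_url("https://example.org/iiif", "ms1/folio 2")
detail = iiif_image_url("https://example.org/iiif", "ms1/folio 2",
                        region="0,0,512,512", size="256,")
```

Because region and size are part of the URL, a viewer can request only the tiles it needs at the current zoom level, which is what makes deep zoom over very large images practical.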

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson presentations: IIIF and JSON-LD.

4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
Schema BibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

Resource Description and Access (RDA): RDAregistry.info is well organized.

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

4.5.3 A TASTE OF FRBROO

EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K.Coyle, SWIB 2015.

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props.

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015). Conceptual Model 1.0 (Sep 2016), Mlist for comments. Document key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities.

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M.Hildebrand)

5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals.

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
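The hunt for missing inventory numbers can be expressed as a SPARQL query against Wikidata: paintings (wdt:P31 wd:Q3305213) in a given collection (wdt:P195) that have no inventory number (wdt:P217). The helper below only builds the query text, as a sketch; it does not call the query service, and the example collection QID is an assumption that should be checked on Wikidata before use.

```python
def missing_inventory_query(collection_qid, limit=100):
    """Build a SPARQL query for paintings in a collection lacking an inventory number."""
    if not (collection_qid.startswith("Q") and collection_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {collection_qid!r}")
    return f"""
SELECT ?painting WHERE {{
  ?painting wdt:P31 wd:Q3305213 ;            # instance of: painting
            wdt:P195 wd:{collection_qid} .   # collection
  FILTER NOT EXISTS {{ ?painting wdt:P217 ?inventoryNumber }}   # inventory number
}}
LIMIT {int(limit)}
""".strip()


# Assumed QID for the J. Paul Getty Museum; verify before relying on it.
GETTY_QID = "Q731126"
query = missing_inventory_query(GETTY_QID, limit=10)
```

The resulting text can be pasted into the Wikidata query service, which predefines the wd: and wdt: prefixes.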

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment: study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true? Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments.

6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists

6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals.

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach.

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
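To give a feel for what such rules do, here is a toy forward-chaining sketch of a single Fundamental Relation, "Thing from Place". It assumes only one path (object produced by a production event that took place at a place) plus transitive place nesting; the fr:from_place property name and the ex: data are hypothetical, and the real implementation covers many more paths through its 120 rules.

```python
def infer_from_place(triples):
    """Infer (thing, fr:from_place, place) from production place and place nesting."""
    facts = set(triples)
    produced_by = {(s, o) for s, p, o in facts if p == "crm:P108i_was_produced_by"}
    took_place_at = {(s, o) for s, p, o in facts if p == "crm:P7_took_place_at"}
    falls_within = {(s, o) for s, p, o in facts if p == "crm:P89_falls_within"}

    # Rule 1: thing produced by an event that took place at a place.
    inferred = {(thing, place)
                for thing, event in produced_by
                for event2, place in took_place_at if event == event2}

    # Rule 2: propagate to broader places until nothing new is derived.
    changed = True
    while changed:
        new = {(thing, broader)
               for thing, place in inferred
               for narrower, broader in falls_within if place == narrower}
        changed = not new <= inferred
        inferred |= new
    return {(thing, "fr:from_place", place) for thing, place in inferred}


data = [
    ("ex:drawing1", "crm:P108i_was_produced_by", "ex:production1"),
    ("ex:production1", "crm:P7_took_place_at", "ex:Amsterdam"),
    ("ex:Amsterdam", "crm:P89_falls_within", "ex:Netherlands"),
    ("ex:Netherlands", "crm:P89_falls_within", "ex:Europe"),
]
relations = infer_from_place(data)
```

A search for things "from Europe" then needs only one simple pattern over the inferred property, instead of a query that spells out every possible CRM path.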

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D.Oldman).

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE
Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole), and everything in-between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
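Jittering can be as simple as adding a small random offset to each marker's coordinates, so that markers sharing one country-level coordinate spread out instead of stacking. The sketch below is an assumption about how this might be done (uniform offsets within a fixed radius in degrees); the centre coordinates are approximate and the radius is arbitrary.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the coordinate shifted by a random offset of at most radius_deg per axis."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))


# Approximate centre of Bulgaria (assumed values); seeded for repeatable output.
CENTER = (42.7, 25.5)
rng = random.Random(42)
markers = [jitter(*CENTER, radius_deg=0.5, rng=rng) for _ in range(9)]
```

Passing a seeded random.Random keeps the layout stable between page loads; an unseeded generator would make the flags jump around on every render.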

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud. (Diagram by J.Cobb, 2014.)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

External Ontologies. See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014.

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basis vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>).

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
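What inverse pairs buy you can be shown in a few lines: given a table of inverse properties (as might be read from a spreadsheet) and a list of symmetric ones, every stated relation yields its reverse statement. This is a sketch of the general owl:inverseOf / owl:SymmetricProperty semantics; the ex: property names and data are hypothetical, not actual GVP relationship codes.

```python
def materialize_inverses(triples, inverse_of, symmetric=()):
    """Add the reverse statement for every inverse or symmetric relation."""
    inverse = dict(inverse_of)
    inverse.update({v: k for k, v in inverse_of.items()})   # pairs work both ways
    inverse.update({p: p for p in symmetric})                # self-inverse props
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


# Hypothetical relation table and data.
INVERSE_OF = {"ex:teacher_of": "ex:student_of"}
SYMMETRIC = ["ex:related_to"]
stated = [
    ("ex:verrocchio", "ex:teacher_of", "ex:leonardo"),
    ("ex:watercolor", "ex:related_to", "ex:gouache"),
]
closed = materialize_inverses(stated, INVERSE_OF, SYMMETRIC)
```

In a triple store the same effect comes from declaring the properties in the ontology and letting the reasoner infer the reverse links, so data only needs to be entered in one direction.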

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.


Page 38: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

411 EDM SEMANTIC GRAPH

412 EDM ISSUESCONSIDERATIONS

Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object

EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs

Many providers use the minimal features and make mistakes Europeana didnt do alot of validation

Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana

formed to push this strategic point (2015-2020)Europeana Data Quality Committee

Evolving speci cation (since 2009)

Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc

42 CIDOC CRM comprehensive reference model used for history historic events

archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual

CIDOC CRM

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

(or including Kindle) (or including Kindle) essential to

understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails

Video Tutorial HTML versionGraphical Representation continuous HTML version

43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image

and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)

W3C TR

Complete Example

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio

441 IIIF PRESENTATION API

Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD

45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies

BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat

soundly criticized

httpbibschemaorg

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats

453 A TASTE OF FRBROO

Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le

TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info, like this.

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
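To illustrate what "leveraging coreferencing" can mean for name enrichment, here is a minimal sketch that unions name forms over clusters of coreferenced records. It is not the method used in the Europeana study; the record ids, name forms and the function `merge_names` are hypothetical, and every id in a coreference pair is assumed to be present in `names_by_id`.

```python
from collections import defaultdict

def merge_names(names_by_id, coref_pairs):
    """Union name forms over clusters of coreferenced record ids (union-find)."""
    parent = {rid: rid for rid in names_by_id}

    def find(x):
        # Follow parent links to the cluster representative, halving the path as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in coref_pairs:
        parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for rid, names in names_by_id.items():
        clusters[find(rid)].update(names)
    return list(clusters.values())

# Hypothetical ids and name forms, for illustration only
names_by_id = {
    "viaf:1": {"Cranach, Lucas", "Lucas Cranach"},
    "wd:1": {"Lucas Cranach", "Лукас Кранах"},
    "viaf:2": {"Spinoza, Benedictus de"},
}
merged = merge_names(names_by_id, [("viaf:1", "wd:1")])
```

The coreferenced pair ends up with the library-style permutation and the translation in one set, which is the gain the conclusions point at.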

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAP:DRGN project)

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. "Drawings by Rembrandt that are about Mammals".

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D.Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.
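The chart boils down to a group-by-year count, which is what a SPARQL GROUP BY aggregation computes on the endpoint. A minimal Python sketch of the same aggregation over already-fetched rows, assuming (hypothetically) that each row is an item id plus an ISO issue date; the ids and dates below are invented:

```python
from collections import Counter

def count_by_year(rows):
    """rows: iterable of (item_id, issued_date), dates in ISO 'YYYY-MM-DD' form."""
    return Counter(int(date[:4]) for _, date in rows)

# Invented sample rows, for illustration only
rows = [("np1", "1914-08-01"), ("np2", "1914-08-02"), ("np3", "1918-11-11")]
counts = count_by_year(rows)
for year in sorted(counts):
    print(year, counts[year])
```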

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.
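A rough sketch of the idea behind extracting a gazetteer from a category tree and pruning it: walk the subcategory graph breadth-first from a root, skip branches marked as irrelevant, and stop at a depth limit. The category names, the pruned set and the depth limit are invented for illustration; the slides do not spell out the actual procedure used in EFD.

```python
from collections import deque

def collect_categories(subcats, root, pruned, max_depth):
    """Breadth-first walk of a category graph from root, skipping pruned branches."""
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        cat, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand below the depth limit
        for child in subcats.get(cat, ()):
            if child in pruned or child in seen:
                continue
            seen.add(child)
            queue.append((child, depth + 1))
    return seen

# Invented category graph, for illustration only
subcats = {
    "Food and drink": ["Beverages", "Food and drink companies"],
    "Beverages": ["Beer", "Wine"],
    "Food and drink companies": ["Restaurant chains"],
}
fd = collect_categories(subcats, "Food and drink", {"Food and drink companies"}, 3)
```

Pruning one category removes its whole subtree (here "Restaurant chains"), and the `seen` set guards against cycles, which category graphs tend to have.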

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
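Jittering can be sketched as adding a small random offset to each marker's coordinates. The radius, the coordinates (a rough centre of Bulgaria) and the function name `jitter` are assumptions for illustration, not what the EFD application actually does.

```python
import random

def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset of at most radius_deg per axis."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))

rng = random.Random(42)   # seeded so the layout is reproducible
center = (42.7, 25.5)     # approximate centre of Bulgaria (assumed values)
points = [jitter(*center, rng=rng) for _ in range(5)]
```

A fixed seed keeps markers from jumping around between page loads; a real map would probably scale the radius with the size of the place.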

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (Diagram by J.Cobb, 2014).
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust.
AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014.

External Ontologies

Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key val can be mapped to: Custom sub-class; Custom (sub-)prop; Ontology Value (e.g. <term/kind/Abbreviation>).

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
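What inverse and symmetric declarations buy you can be sketched in a few lines: for every stated relation, the opposite direction is added automatically. This is only an illustration of the semantics, not the GVP implementation (which relies on repository inference); the property names under `ex:` and the subject ids are hypothetical.

```python
def materialize_inverses(triples, inverse_of, symmetric):
    """Add the inverse statement for every triple whose property has a declared inverse."""
    inv = dict(inverse_of)
    inv.update({q: p for p, q in inverse_of.items()})  # inverseOf holds in both directions
    inv.update({p: p for p in symmetric})               # a symmetric property is self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inv:
            result.add((o, inv[p], s))
    return result

# Hypothetical property names and ids, for illustration only
inverse_of = {"ex:teacherOf": "ex:studentOf"}
symmetric = {"ex:colleagueOf"}
triples = {
    ("ulan:A", "ex:teacherOf", "ulan:B"),
    ("ulan:A", "ex:colleagueOf", "ulan:C"),
}
closed = materialize_inverses(triples, inverse_of, symmetric)
```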

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in "Getty Vocabs: Why LOD? Why Now?" (J.Cobb, 2014). E.g.:

AAT used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

68 JPGETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo. CIDOC Congress, 2016

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
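The idea can be sketched as follows, using the field names from the sample record. The output structure, the relation keys (`spouseOf`, `childOf`, `siblingOf`) and the gender inference rule are assumptions made for illustration; the matching step against other Person records is not shown.

```python
def derive_persons(rec):
    """Create Person dicts for the people named in a record, with a few likely inferences."""
    main = {"name": f'{rec["firstName"]} {rec["lastName"]}', "gender": rec.get("gender")}
    persons = [main]
    if "nameSpouse" in rec:
        spouse = {"name": rec["nameSpouse"], "spouseOf": main["name"]}
        if "nameSpouseMaiden" in rec:
            spouse["maidenName"] = rec["nameSpouseMaiden"]
        # Likely inference (an assumption, not a certainty): the spouse of a Male is Female
        if rec.get("gender") == "Male":
            spouse["gender"] = "Female"
        persons.append(spouse)
    for key, relation in (("nameChild", "childOf"), ("nameSibling", "siblingOf")):
        if key in rec:
            persons.append({"name": rec[key], relation: main["name"]})
    return persons

rec = {
    "id": 123456, "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
persons = derive_persons(rec)
```

Each derived Person carries a back-link to the main person, so a later match to another record also yields an edge in the person network.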

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
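Both the nearest-word lists and the "semantic differencing" come down to cosine comparison of word vectors. A toy sketch with made-up 3-dimensional vectors (real word2vec embeddings are learned from the corpus and have hundreds of dimensions); the vectors and the function names are illustrative assumptions, not the INRIA setup.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Invented toy vectors: (organisation-ness, Soviet context, German context)
vectors = {
    "KGB": [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS": [1.0, 0.0, 1.0],
    "guard": [0.5, 0.2, 0.2],
}
result = analogy(vectors, "KGB", "Stalin", "Hitler")
```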

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames, so we can get coordinates.

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki


412 EDM ISSUESCONSIDERATIONS

Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:

EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs

Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.

Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.

Europeana Data Quality Committee formed to push this strategic point (2015-2020).

Evolving specification (since 2009):

Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.

42 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

421 CIDOC CRM PROPERTIES

Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
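The relation between the two forms can be shown with plain triples: wherever the long path exists, the short-cut follows from it. A minimal sketch using the property names from the slide; the `ex:` resources and the function name are hypothetical, and this is an illustration rather than an official CRM rule set.

```python
def infer_dimension_shortcuts(triples):
    """From thing -P39i-> measurement -P40-> dimension, derive thing -P43-> dimension."""
    measured_by = [(s, o) for s, p, o in triples if p == "crm:P39i_was_measured_by"]
    observed = {}
    for s, p, o in triples:
        if p == "crm:P40_observed_dimension":
            observed.setdefault(s, []).append(o)
    return {(thing, "crm:P43_has_dimension", dim)
            for thing, measurement in measured_by
            for dim in observed.get(measurement, [])}

# Hypothetical instance data, for illustration only
long_path = [
    ("ex:painting1", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height1"),
]
shortcuts = infer_dimension_shortcuts(long_path)
```

The long path keeps a node for the measurement itself, which is where extra details (who measured, when) can be attached; the short-cut drops that node.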

43 WEB ANNOTATION (OPEN ANNOTATION OA)

W3C TR: mark, annotate, relate any web resources. E.g. Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary.

Diagram of Complete Example from spec (using my rdfpuml)

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers.
http://iiif.io
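As a rough illustration of why a standard image API is convenient: an IIIF Image API request is just a URL of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The sketch below builds such a URL; the server base and identifier are made up, and the default size "full" follows version 2 of the Image API (version 3 uses "max"), so check the version your server implements.

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL; the identifier is percent-encoded."""
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Hypothetical server and identifier: top-left 1024x1024 region, scaled to 512px wide
url = iiif_image_url("https://example.org/iiif", "ms 12/f1r",
                     region="0,0,1024,1024", size="512,")
```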

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD.

45 LIBRARY ONTOLOGIES

"War of the Bibliographic Ontologies":

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

453 A TASTE OF FRBROO

EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR, Before and After by K.Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K.Coyle, SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation(TM). Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, if self-inverse).
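To illustrate what inverse and symmetric declarations buy you, here is a minimal sketch in plain Python (no RDF library). It is not the GVP implementation: the relation names (ex:teacherOf, ex:studentOf, ex:relatedTo) are hypothetical, and triples are simple tuples.

```python
def materialize_inverses(triples, inverse_of, symmetric):
    """Return the input triples plus the statements implied by
    owl:inverseOf pairs and owl:SymmetricProperty declarations."""
    # Make the inverse lookup work in both directions.
    both_ways = dict(inverse_of)
    both_ways.update({inv: prop for prop, inv in inverse_of.items()})

    result = set(triples)
    for subj, prop, obj in triples:
        if prop in both_ways:
            # p inverseOf q:  (s p o)  implies  (o q s)
            result.add((obj, both_ways[prop], subj))
        if prop in symmetric:
            # p symmetric:    (s p o)  implies  (o p s)
            result.add((obj, prop, subj))
    return result


# Hypothetical relation names, for illustration only.
INVERSE_OF = {"ex:teacherOf": "ex:studentOf"}
SYMMETRIC = {"ex:relatedTo"}

data = {
    ("ex:A", "ex:teacherOf", "ex:B"),
    ("ex:C", "ex:relatedTo", "ex:D"),
}
closed = materialize_inverses(data, INVERSE_OF, SYMMETRIC)
```

In a real repository this inference would normally be left to the triple store's rule engine; the sketch only shows which statements are implied.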

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V, Cobb J, Garcia G, Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014).

Eg AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

68 JP GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 JP GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) Cast after.

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
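The idea can be sketched as follows. This is only an illustration, not the EHRI pipeline: the field names come from the sample record above, while the output structure, the id scheme and the "bornAfter" inference are assumptions made for the example.

```python
# Which record fields mention other people, and how they relate to the main person.
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def derive_persons(rec):
    """Create one Person dict for the record itself and one per mentioned relative."""
    persons = [{
        "id": f"{rec['id']}/self",
        "name": f"{rec['firstName']} {rec['lastName']}",
        "relation": "self",
        "gender": rec.get("gender"),
    }]
    for field, relation in RELATION_FIELDS.items():
        if field not in rec:
            continue
        person = {"id": f"{rec['id']}/{relation}", "name": rec[field], "relation": relation}
        if relation == "spouse" and "nameSpouseMaiden" in rec:
            person["maidenName"] = rec["nameSpouseMaiden"]
        if relation == "child" and "dateMarriage" in rec:
            # Likely (not certain) inference, assumed here: child born after the marriage.
            person["bornAfter"] = rec["dateMarriage"]
        persons.append(person)
    return persons


record = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
persons = derive_persons(record)
```

The derived records can then be fed to whatever matching step is used to compare them with existing Person records.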

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews.

ONTO: place enrichment, person name recognition.
INRIA: word2vec experiments.

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
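The two operations used above (nearest neighbours by cosine, and vector arithmetic for analogies) can be sketched in a few lines of plain Python. The 4-dimensional vectors below are invented toy values chosen so that the arithmetic works out; they are not taken from the INRIA model, which was trained on the interview corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(vectors, query, exclude=(), topn=3):
    """Rank vocabulary words by cosine similarity to the query vector."""
    scored = [(word, cosine(query, vec)) for word, vec in vectors.items() if word not in exclude]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return most_similar(vectors, query, exclude={a, b, c}, topn=1)[0][0]


# Toy vectors (assumed); dimensions are not meaningful in a real model.
TOY = {
    "KGB":    [1.0, 1.0, 0.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0, 1.0],
    "Hitler": [0.0, 0.0, 1.0, 1.0],
    "SS":     [1.0, 0.0, 1.0, 0.0],
    "guard":  [1.0, 0.0, 0.0, 0.0],
}
answer = analogy(TOY, "KGB", "Stalin", "Hitler")
```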

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. Eg a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius... rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.


42 CIDOC CRM

CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).

422 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
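The short-cut vs long-path construct can be sketched as a simple rewrite over triples. This is an illustrative sketch only: triples are plain tuples, the blank-node naming is an assumption, and the object and dimension identifiers are hypothetical. The CRM names used (P43, P39i, P40, E16_Measurement) are the ones the model defines for this pattern.

```python
def expand_dimension_shortcut(triples):
    """Rewrite each crm:P43_has_dimension short-cut into the long path
    thing -P39i_was_measured_by-> measurement -P40_observed_dimension-> dimension."""
    expanded = []
    counter = 0
    for subj, prop, obj in triples:
        if prop == "crm:P43_has_dimension":
            counter += 1
            # The intermediate event is where extra details (who measured, when) can attach.
            measurement = f"_:measurement{counter}"
            expanded.extend([
                (subj, "crm:P39i_was_measured_by", measurement),
                (measurement, "rdf:type", "crm:E16_Measurement"),
                (measurement, "crm:P40_observed_dimension", obj),
            ])
        else:
            expanded.append((subj, prop, obj))
    return expanded


# Hypothetical identifiers, for illustration only.
short = [
    ("ex:painting1", "crm:P2_has_type", "ex:Painting"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
]
long_path = expand_dimension_shortcut(short)
```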

43 WEB ANNOTATION (OPEN ANNOTATION, OA)

W3C TR: mark, annotate, relate any web resources, eg: Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
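As a small illustration of the "Image and region over it" case, the sketch below builds a Web Annotation as a JSON-LD compatible Python dict. The annotation id, image URL and comment are hypothetical; the context URL, the TextualBody and FragmentSelector types and the media-fragment xywh syntax follow the W3C Web Annotation Data Model as I understand it, so check them against the spec before relying on them.

```python
import json


def image_region_annotation(anno_id, image_url, xywh, comment):
    """Build an annotation whose target is a rectangular region of an image."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Hypothetical URLs, for illustration only.
anno = image_region_annotation(
    "http://example.org/anno/1",
    "http://example.org/images/painting.jpg",
    (10, 20, 300, 200),
    "Signature of the artist",
)
serialized = json.dumps(anno, indent=2)
```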

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
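A IIIF Image API request is just a URL with a fixed sequence of path segments: identifier, region, size, rotation, quality and format. The helper below is a minimal sketch of that pattern; the server base URL and identifier are hypothetical, and the defaults assume Image API 2.x conventions (later versions prefer "max" over "full" for size).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    # The identifier is percent-encoded as a whole, including any slashes.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier: a 1024x768 region scaled to 512 pixels wide.
url = iiif_image_url(
    "https://example.org/iiif",
    "ms 12/f1r",
    region="0,0,1024,768",
    size="512,",
)
```

Because the region and size are part of the URL, a viewer can fetch only the tiles it needs for the current zoom level.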

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows to assemble manuscripts from pieces, present folios, etc etc. See eg Rob Sanderson presentations: IIIF and JSON-LD.

45 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (eg citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats.

453 A TASTE OF FRBROO

EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, eg on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K.Coyle, SWIB 2015.

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf, rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on Kiosk.

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

46 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015).
Conceptual Model 1.0 (Sep 2016), Mlist for comments. Documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities.

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M.Hildebrand)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). Eg Frans Hals.

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2).

515 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

516 HISTROPEDIA

Timelines of everything. Eg paintings by Leonardo.

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment: study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata).
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: librarians' dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy.

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, eg:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). Prosopography is Greek for Facebook (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments.

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013). Semantic Search.

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using the Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D.Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE

Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but is easy with SPARQL.

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint; open source (Github). Uses GraphDB and ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

Eg 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A, Tolosi L, and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V, DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
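Jittering can be sketched as adding a small random offset to each marker. This is an illustration only, not the EFD code: the centre coordinates for Bulgaria, the maximum offset and the fixed seed are all assumptions made for the example.

```python
import random


def jitter(lat, lon, count, max_offset_deg=0.5, seed=0):
    """Return `count` points scattered uniformly in a square around (lat, lon)."""
    rng = random.Random(seed)  # fixed seed keeps the map stable between page loads
    return [
        (lat + rng.uniform(-max_offset_deg, max_offset_deg),
         lon + rng.uniform(-max_offset_deg, max_offset_deg))
        for _ in range(count)
    ]


# Approximate centre of Bulgaria (assumed value for the example).
BG_LAT, BG_LON = 42.7, 25.5
flags = jitter(BG_LAT, BG_LON, count=9000)
```

A uniform square is the simplest choice; a real map might instead scatter points within the country polygon so that no flag lands across the border.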

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud (Diagram by J.Cobb 2014, GVP Training Materials).

671 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.


Page 41: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

421 CIDOC CRM PROPERTIES

Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)

422 CIDOC GRAPHICAL EXAMPLES

(or including Kindle) (or including Kindle) essential to

understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails

Video Tutorial HTML versionGraphical Representation continuous HTML version

43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image

and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)

W3C TR

Complete Example

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio

441 IIIF PRESENTATION API

Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD

45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies

BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat

soundly criticized

httpbibschemaorg

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats

453 A TASTE OF FRBROO

Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le

TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

4.6 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015), Conceptual Model 1.0 (Sep 2016), Mlist for comments.

Conceptual Model: document key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)

5.1 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals.

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
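A hunt like this can be expressed as a query against the Wikidata Query Service. The sketch below only builds the query text; it assumes the usual Wikidata identifiers (P31 instance of, Q3305213 painting, P195 collection, P217 inventory number). The function name and the placeholder collection id `Q0` are made up for illustration, so substitute the item id of the collection you care about.

```python
def missing_inventory_query(collection_qid: str, limit: int = 100) -> str:
    """Build a SPARQL query for paintings in a collection that have no inventory number."""
    if not (collection_qid.startswith("Q") and collection_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {collection_qid!r}")
    return f"""
SELECT ?painting ?paintingLabel WHERE {{
  ?painting wdt:P31 wd:Q3305213 ;          # instance of: painting
            wdt:P195 wd:{collection_qid} .  # collection
  FILTER NOT EXISTS {{ ?painting wdt:P217 ?inventoryNumber }}  # inventory number is missing
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" }}
}}
LIMIT {limit}
""".strip()


# "Q0" is a placeholder, not a real collection item.
print(missing_inventory_query("Q0", limit=10))
```

The resulting text can be pasted into the query service UI; each hit is a painting whose inventory number is still waiting to be filled in.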

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

5.2 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata).
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
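To see why coreferencing across datasets pays off, here is a minimal sketch (not the method used in the study) that merges pairwise "same entity" links into clusters with a small union-find. All identifiers are invented placeholders. The point is that a Wikidata-to-GND link plus a GND-to-VIAF link puts the Wikidata item and the VIAF record in one cluster, even though nobody linked them directly.

```python
from collections import defaultdict


def coreference_clusters(links):
    """Group identifiers into clusters, given pairs that denote the same entity."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(set)
    for x in list(parent):
        clusters[find(x)].add(x)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical identifiers, for illustration only.
links = [
    ("wikidata:Q_A", "gnd:aaa"),   # crowd-sourced link on Wikidata
    ("gnd:aaa", "viaf:111"),       # GND is a VIAF constituent
    ("rkd:222", "wikidata:Q_B"),   # non-constituent dataset
]
for cluster in coreference_clusters(links):
    print(cluster)
```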

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments.

6.1 MELLON "SPACE" PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013). Semantic Search.

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals.

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach.

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven using the Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE

Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data SPARQL endpoint; open source (Github). Uses GraphDB and ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
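Jittering can be sketched in a few lines. This is an illustration under stated assumptions, not the EFD code: the function name, the 50 km radius and the rough center coordinates are all made up, and the degree-to-km conversion uses the common approximation of about 111 km per degree of latitude.

```python
import math
import random


def jitter(lat, lon, max_km=50.0, rng=random):
    """Return (lat, lon) displaced by a random offset of at most max_km."""
    # sqrt on the radius spreads points evenly over the disc instead of crowding the center
    distance = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    dlat = distance * math.cos(bearing) / 111.0  # ~111 km per degree of latitude
    dlon = distance * math.sin(bearing) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


# Approximate center of Bulgaria, used here only as an example point.
center = (42.7, 25.5)
rng = random.Random(2016)  # fixed seed so the flags stay put between page loads
for _ in range(3):
    print(jitter(*center, rng=rng))
```

Seeding the generator (or deriving the seed from the object id) keeps each flag in the same spot on every render, which matters once users start bookmarking map views.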

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud (Diagram by J. Cobb, 2014).
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.

External Ontologies:

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>).

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
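What inverse and symmetric declarations buy you can be shown with a small sketch. It is not the GVP implementation (a semantic repository would do this by inference); the property names below are hypothetical, not actual GVP relations. For each asserted triple, the reverse statement is added with the inverse property, or with the same property when it is symmetric.

```python
def materialize_inverses(triples, inverse_of, symmetric):
    """Add the reverse statement for every triple whose property has an inverse.

    inverse_of: dict mapping a property to its owl:inverseOf partner (one direction is enough)
    symmetric:  iterable of properties declared owl:SymmetricProperty
    """
    inverse = dict(inverse_of)
    inverse.update({q: p for p, q in inverse_of.items()})  # inverseOf works both ways
    inverse.update({p: p for p in symmetric})              # symmetric = self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


# Hypothetical relation names, for illustration only.
triples = {
    ("ex:potters", "ex:produce", "ex:pottery"),
    ("ex:sketch", "ex:distinguished_from", "ex:drawing"),
}
closed = materialize_inverses(
    triples,
    inverse_of={"ex:produce": "ex:produced_by"},
    symmetric=["ex:distinguished_from"],
)
for triple in sorted(closed):
    print(triple)
```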

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014).

E.g. AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields. PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
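The idea can be sketched on the sample record. This is only an illustration, not EHRI code: the `Person` class, the relation labels and the single inference shown (the spouse's maiden name yields an alternative name) are assumptions made for the example. Matching the resulting records against the database is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    gender: str = "unknown"
    alt_names: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (relation, name of the other person)


# How each "additional name" field relates to the main person, and the reverse relation.
MENTIONS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}
INVERSE = {"spouse": "spouse", "child": "parent", "sibling": "sibling"}


def expand_record(rec):
    """Create a Person for the record subject and for every person mentioned in it."""
    main = Person(f"{rec['firstName']} {rec['lastName']}", rec.get("gender", "unknown"))
    people = [main]
    for key, relation in MENTIONS.items():
        other_name = rec.get(key)
        if not other_name:
            continue
        other = Person(other_name)
        main.relations.append((relation, other.name))
        other.relations.append((INVERSE[relation], main.name))
        if relation == "spouse" and rec.get("nameSpouseMaiden"):
            # Likely inference: before marriage the spouse went by the maiden surname.
            given = other_name.split()[0]
            other.alt_names.append(f"{given} {rec['nameSpouseMaiden']}")
        people.append(other)
    return people


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
for person in expand_record(record):
    print(person)
```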

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
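A toy version of matching with deduplication, to make the idea concrete. It is far simpler than a real pipeline (no fuzzy matching, no hierarchy or coordinates for disambiguation), and the gazetteer ids and names below are invented placeholders, not real Geonames data. Source names that normalize to the same gazetteer entry end up grouped under one id, which is the deduplication effect.

```python
import unicodedata


def normalize(name):
    """Lowercase, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.casefold().split())


def match_places(source_names, gazetteer):
    """Map source place names to gazetteer ids; ambiguous or unknown names stay unmatched."""
    index = {}
    for place_id, names in gazetteer.items():
        for n in names:
            index.setdefault(normalize(n), set()).add(place_id)
    matched, unmatched = {}, []
    for name in source_names:
        ids = index.get(normalize(name), set())
        if len(ids) == 1:
            matched.setdefault(next(iter(ids)), []).append(name)
        else:
            unmatched.append(name)
    return matched, unmatched


# Hypothetical gazetteer: id -> main name and alternate names.
gazetteer = {
    "id-1": ["Lodz", "Litzmannstadt"],
    "id-2": ["Kraków", "Cracow"],
}
matched, unmatched = match_places(
    ["Krakow", "KRAKÓW", "Cracow", "Litzmannstadt", "Atlantis"], gazetteer
)
print(matched)
print(unmatched)
```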

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
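Both the neighbour lists and the "differencing" rest on the same arithmetic over word vectors. A minimal sketch with invented 3-dimensional toy vectors (real word2vec vectors have hundreds of dimensions and are learned from the interview text): cosine similarity ranks neighbours, with higher values meaning closer, and the analogy picks the word closest to a - b + c.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors, made up for illustration: (security organ, Soviet, German).
vectors = {
    "kgb":    [1.0, 1.0, 0.0],
    "stalin": [0.0, 1.0, 0.0],
    "hitler": [0.0, 0.0, 1.0],
    "ss":     [1.0, 0.0, 1.0],
    "guard":  [0.8, 0.3, 0.3],
}
print(analogy(vectors, "kgb", "stalin", "hitler"))
```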

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS: POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.


4.2.2 CIDOC GRAPHICAL EXAMPLES

Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.

Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
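The short-cut vs long-path construct can be illustrated by expanding one into the other. A minimal sketch over plain tuples, assuming the prefixed names from the text plus crm:E16_Measurement as the intermediate node's class; the object and dimension identifiers are hypothetical. The extra Measurement node is where the "more details" (who measured, when) would attach.

```python
import itertools


def expand_has_dimension(triples):
    """Rewrite each P43 short-cut as the long path through an E16 Measurement node."""
    counter = itertools.count(1)
    expanded = []
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            measurement = f"_:measurement{next(counter)}"  # fresh blank node per short-cut
            expanded.append((s, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, "crm:P40_observed_dimension", o))
        else:
            expanded.append((s, p, o))
    return expanded


# Hypothetical instance data.
triples = [
    ("ex:painting1", "rdf:type", "crm:E22_Man-Made_Object"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:painting1/height"),
]
for triple in expand_has_dimension(triples):
    print(triple)
```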

4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)

W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary.

Diagram of Complete Example from spec (using my rdfpuml).
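For the "Image and region over it" case, an annotation in the JSON-LD shape of the W3C Web Annotation Data Model might look like the sketch below. The helper name and all URLs are placeholders; the keys follow the W3C model's JSON-LD context, which may differ in detail from the older Open Annotation vocabulary.

```python
import json


def make_annotation(annotation_id, image_url, region_xywh, comment):
    """Build a Web Annotation that attaches a text comment to a rectangular image region."""
    x, y, w, h = region_xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": annotation_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",  # media fragment: region in pixels
            },
        },
    }


# Placeholder URLs, for illustration only.
annotation = make_annotation(
    "http://example.org/anno/1",
    "http://example.org/images/painting.jpg",
    (100, 150, 300, 200),
    "Signature of the artist",
)
print(json.dumps(annotation, indent=2))
```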

4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
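What makes the image API "standard" is its URL pattern: identifier, region, size, rotation, quality and format, in that order. A small sketch of a URL builder, assuming the Image API 2.x parameter conventions (newer versions use "max" rather than "full" for size); the server base and identifier are placeholders.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build a IIIF Image API request URL: {base}/{id}/{region}/{size}/{rotation}/{quality}.{fmt}"""
    encoded_id = quote(identifier, safe="")  # identifiers must be URL-encoded, slashes included
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Placeholder server and identifier: top-left 1024x1024 pixels, scaled to 512 wide.
print(iiif_image_url("https://example.org/iiif", "folio 1r",
                     region="0,0,1024,1024", size="512,"))
```

Because the region and size live in the URL, a deep-zoom viewer only needs to compute tile coordinates and request them; no server-specific API is involved.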

4.4.1 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson presentations: IIIF and JSON-LD.

4.5 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

The Resource Description and Access (RDA) Registry info is well organized.

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

4.5.3 A TASTE OF FRBROO

The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR, Before and After by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K. Coyle, SWIB 2015.

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds its own props.

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014).

E.g. AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields. PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.

68 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. (WD query shown as image grid.)

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work is ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

- In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
- In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced the chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
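The idea above can be sketched as follows. This is an illustration only: the `Person` class, the relation names and the inference rule (spouse's first name plus maiden surname as an alternative name) are assumptions made for the example, not the EHRI implementation. The field names come from the sample record.

```python
from dataclasses import dataclass, field

# Inverse of each relation, as seen from the other person (illustrative).
INVERSE = {"spouse": "spouse", "child": "parent", "sibling": "sibling"}


@dataclass
class Person:
    name: str
    gender: str = "unknown"
    alt_names: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (relation, other person's name)


def expand_record(rec):
    """Create a Person for the record subject and for each person mentioned."""
    main_name = f"{rec['firstName']} {rec['lastName']}"
    main = Person(main_name, rec.get("gender", "unknown"))
    people = [main]
    mentioned = (("nameSpouse", "spouse"), ("nameChild", "child"), ("nameSibling", "sibling"))
    for key, relation in mentioned:
        other_name = rec.get(key)
        if not other_name:
            continue
        other = Person(other_name)
        main.relations.append((relation, other_name))
        other.relations.append((INVERSE[relation], main_name))
        # Assumed "likely inference": the spouse was also known under the maiden name.
        maiden = rec.get("nameSpouseMaiden")
        if relation == "spouse" and maiden:
            other.alt_names.append(f"{other_name.split()[0]} {maiden}")
        people.append(other)
    return people


RECORD = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
```

The derived records (with their alternative names and relations) are what would then be matched against the other Person records in the database.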

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: place enrichment, person name recognition. INRIA: word2vec experiments.

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
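A minimal sketch of what the word2vec numbers mean: nearest neighbours are ranked by the cosine between word vectors, and "semantic differencing" is vector arithmetic followed by a nearest-neighbour lookup. The 4-dimensional toy vectors and the placeholder words below are made up for the example; real embeddings are learned from the interview text and have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Invented toy vectors; dimensions: [is_org, is_leader, regime_a, regime_b].
TOY = {
    "org_a":    [1.0, 0.0, 1.0, 0.0],
    "leader_a": [0.0, 1.0, 1.0, 0.0],
    "leader_b": [0.0, 1.0, 0.0, 1.0],
    "org_b":    [1.0, 0.0, 0.0, 1.0],
    "other":    [0.0, 0.0, 1.0, 1.0],
}
```

With these toy vectors, `analogy(TOY, "org_a", "leader_a", "leader_b")` picks `"org_b"`, the same pattern as the example above.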

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

- Art history networks from Wikipedia, through VIAF id
- Time and nationality from ULAN

612 CHARTEX

NLP analysis of medieval charters and deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

- Numishare: data platform for coins/medals, 100k coin types
- Nomisma: shared authorities for numismatics
- Kerameikos: pottery LOD
- EADitor: EAD editor based on XML & XForms, uses/produces LOD
- xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint:

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.


43 WEB ANNOTATION (OPEN ANNOTATION, OA)

W3C TR: mark, annotate, relate any web resources, e.g.:

- Webpage and bookmark
- Image and region over it
- Document and translation
- Paragraph and commentary

Diagram of Complete Example from the spec (using my rdfpuml).
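To make the "image and region over it" case concrete, here is a small sketch that builds such an annotation as JSON-LD. It follows the shape of the W3C Web Annotation Data Model (TextualBody, FragmentSelector with a media-fragment `xywh` value); the example.org URLs and the comment text are placeholders, and this is not the "Complete Example" from the spec.

```python
import json


def image_region_annotation(anno_id, image_url, x, y, w, h, comment):
    """Build a Web Annotation that attaches a text comment to a rectangle of an image."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",  # region in pixels
            },
        },
    }


anno = image_region_annotation(
    "http://example.org/anno/1",         # placeholder annotation IRI
    "http://example.org/images/hals.jpg",  # placeholder image IRI
    100, 50, 300, 400,
    "Detail worth a closer look",
)
print(json.dumps(anno, indent=2))
```

The other cases (bookmark, translation, commentary) differ only in the body and target: the target may be a whole resource, and the body may itself be another web resource.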

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)

Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
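As an illustration of the standard API, an IIIF Image API request is just a URL of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The sketch below assembles such a URL; the server base and identifier are placeholders, and the default size keyword `full` is the Image API 2.x spelling (version 3 uses `max`).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble an IIIF Image API URL; the identifier is percent-encoded."""
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Top-left 512x512 pixels of a (placeholder) image, scaled to 256 pixels wide.
print(iiif_image_url("https://example.org/iiif", "book1/page 1",
                     region="0,0,512,512", size="256,"))
```

A deep-zoom viewer issues many such requests, one per tile and zoom level, which is why any compliant server works with any compliant viewer.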

441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows you to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD.

45 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

- BIBO: used for a long time, pragmatic
- FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
- FRBRoo: based on CIDOC CRM, perhaps too complex
- FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
- BIBFRAME: sponsored by LoC, but soundly criticized for modeling mistakes
- RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
- SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

The Resource Description and Access (RDA) registry, RDAregistry.info, is well organized.

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

453 A TASTE OF FRBROO

The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant:

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

- FRBR, Before and After by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations
- Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
- Mistakes have been made, K. Coyle, SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search for related books on a kiosk.

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time,
    d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig" ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

46 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

- Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
- Progress report (2015). Mlist for comments
- Conceptual Model 10 (Sep 2016): documents key components of archival description, properties of each, relations between them
- Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

- Some established thesauri and gazetteers as LOD; some are interconnected
- DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
- Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
- (Diagram based on work by M. Hildebrand)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals.

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info like this:

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

- 2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
- 2015-01: Wikidata Project Authority Control (initiated by Ontotext)
- 2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names

Conclusions:

- The best datasets to use for name enrichment are VIAF and Wikidata
- There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
- VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
- Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
- A lot can be gained by leveraging coreferencing across VIAF and Wikidata
- Wikidata has great tools for crowd-sourced coreferencing
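One way to see what "leveraging coreferencing across VIAF and Wikidata" buys: every pairwise link (Wikidata to VIAF, VIAF to a constituent authority, Wikidata to a non-constituent authority) can be merged into clusters, so that identifiers which were never linked directly end up together. The sketch below does this with a union-find; the identifiers are invented placeholders, and this is not the method used in the Europeana study.

```python
def coreference_clusters(links):
    """Merge pairwise 'same entity' links into clusters (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return sorted(sorted(members) for members in clusters.values())


# Invented identifiers, for illustration only.
LINKS = [
    ("wikidata:Q_EXAMPLE", "viaf:111"),   # coreference stated on Wikidata
    ("viaf:111", "ulan:222"),             # coreference stated in a VIAF cluster
    ("wikidata:Q_OTHER", "gnd:333"),      # person not (yet) linked to VIAF
]
print(coreference_clusters(LINKS))
```

Here `wikidata:Q_EXAMPLE` and `ulan:222` land in the same cluster although no source links them directly; blind transitive merging also propagates any wrong link, so real pipelines would check clusters before trusting them.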

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences: 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

- Publishing vocabularies/thesauri as LOD
- Publishing museum collections and national bibliographies as LOD
- Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
- Study of artistic influence over time and space
- Literary traditions, parallel editions
- Poetic repertories
- Studying manuscripts, stemmatology (manuscript derivation)
- Historiography
- Studying charters, prosopography (micro-biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

These support research functions and are sometimes integrated into Virtual Research Environments.

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

- CollectionSpace: museum collection management
- ArchivesSpace: archive management
- ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
- ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).

Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search: drawings by Rembrandt that are about mammals.

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

- GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
- As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM
- This mapping was followed by the Yale Center for British Art (YCBA)
- Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

64 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
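A minimal sketch of jittering, assuming each object only carries the centroid of its country: every marker is moved by a random offset inside a disc around that point. The 50 km radius, the approximate centroid used for Bulgaria and the flat-earth conversion (about 111 km per degree of latitude) are assumptions made for the example, not the values used in the EFD app.

```python
import math
import random


def jitter(lat, lon, max_km=50.0, rng=random):
    """Move a point by a random offset of at most max_km (approximate)."""
    # sqrt() spreads points evenly over the disc instead of piling them at the center.
    distance_km = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    km_per_deg_lat = 111.0
    km_per_deg_lon = 111.0 * math.cos(math.radians(lat))
    new_lat = lat + (distance_km * math.cos(bearing)) / km_per_deg_lat
    new_lon = lon + (distance_km * math.sin(bearing)) / km_per_deg_lon
    return new_lat, new_lon


# Approximate centroid of Bulgaria (assumed value), jittered for a few markers.
markers = [jitter(42.7, 25.5) for _ in range(5)]
```

Passing a seeded `random.Random` instance as `rng` makes the layout reproducible, so markers do not jump around between page loads.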

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

- Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
- Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
- Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD

- GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
- Center of LODLAM cloud (diagram by J. Cobb, 2014)
- GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

672 ONTOTEXT SCOPE OF WORK

- Semantic/ontology development: http://vocab.getty.edu/ontology
- Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
- Complete mapping specification
- Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
- Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
- GraphDB semantic repo, clustered for high availability
- Semantic application development (customized Forest user interface) and tech consulting
- SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
- Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
- Sample queries (100), including charts, geographic queries, etc.
- Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
- Help desk / support on twitter and google group (see home page)
- Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
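Since the endpoint is SPARQL 1.1 compliant, any client can query it over plain HTTP by passing the query text in the `query` parameter. The sketch below only builds such a request URL and does not send it. The query shape (a `skos:prefLabel` lookup filtered by language) and the concept ID in the usage line are placeholders chosen for the example; the sample queries in the documentation are the authoritative reference.

```python
from urllib.parse import urlencode

GETTY_SPARQL = "http://vocab.getty.edu/sparql"


def pref_label_query(concept_id, lang="en"):
    """SPARQL text asking for the preferred labels of one AAT concept."""
    return (
        "PREFIX aat: <http://vocab.getty.edu/aat/>\n"
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        f"SELECT ?label WHERE {{ aat:{concept_id} skos:prefLabel ?label . "
        f'FILTER(langMatches(lang(?label), "{lang}")) }}'
    )


def sparql_get_url(endpoint, query):
    """URL for a SPARQL protocol GET request (query passed as a parameter)."""
    return endpoint + "?" + urlencode({"query": query})


# "300000000" is a placeholder ID, not a known AAT concept.
print(sparql_get_url(GETTY_SPARQL, pref_label_query("300000000")))
```

The result format is chosen by the client with an HTTP Accept header, for example `application/sparql-results+json`.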


Page 44: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio

441 IIIF PRESENTATION API

Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD

45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies

BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat

soundly criticized

httpbibschemaorg

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats

453 A TASTE OF FRBROO

Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le

TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy.

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters: prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAPDRGN project)

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. drawings by Rembrandt that are about mammals.

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D.Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but monolithic, and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: you can't do this using the Europeana API, but it is easy with SPARQL.

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract an FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, and NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
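A minimal sketch of jittering, assuming each object only carries the coordinates of the country centre. The centre coordinates, the jitter radius and the `jitter` function are illustrative assumptions, not the code used in the EFD application:

```python
import random


def jitter(lat: float, lon: float, radius_deg: float, rng: random.Random) -> tuple:
    """Move a point by a random offset of at most radius_deg in each direction."""
    return (
        lat + rng.uniform(-radius_deg, radius_deg),
        lon + rng.uniform(-radius_deg, radius_deg),
    )


# Assumed approximate centre of Bulgaria; the radius is an arbitrary illustrative choice.
CENTER = (42.7, 25.5)
rng = random.Random(42)  # fixed seed so the sketch is reproducible
flags = [jitter(CENTER[0], CENTER[1], 0.5, rng) for _ in range(5)]

for lat, lon in flags:
    print(f"{lat:.3f}, {lon:.3f}")
```

A uniform square offset is the simplest choice; a real map would probably scale the radius to the size of the place so that flags stay inside its borders.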

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk, support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014.

External Ontologies:

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:

Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
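The effect of declaring relations as inverse pairs or as symmetric can be sketched in a few lines. This is an illustration under stated assumptions: triples are plain Python tuples, the `ex:` relation names are hypothetical, and `add_inverses` is a stand-in for what a reasoner such as GraphDB would infer from owl:inverseOf and owl:SymmetricProperty declarations:

```python
def add_inverses(triples: set, inverse_of: dict, symmetric: set) -> set:
    """Return triples plus the statements implied by inverse and symmetric properties."""
    # Make the inverse table bidirectional: if p is the inverse of q, q is the inverse of p.
    both_ways = dict(inverse_of)
    both_ways.update({q: p for p, q in inverse_of.items()})

    inferred = set(triples)
    for s, p, o in triples:
        if p in both_ways:
            inferred.add((o, both_ways[p], s))
        if p in symmetric:
            inferred.add((o, p, s))
    return inferred


# Hypothetical relation names and resources, for illustration only.
INVERSE_OF = {"ex:teacherOf": "ex:studentOf"}
SYMMETRIC = {"ex:relatedTo"}
data = {
    ("ex:a", "ex:teacherOf", "ex:b"),
    ("ex:c", "ex:relatedTo", "ex:d"),
}

for triple in sorted(add_inverses(data, INVERSE_OF, SYMMETRIC)):
    print(triple)
```

Stating a relation once and letting the inverse be inferred keeps the source data small while still allowing queries from either end.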

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014).

E.g. AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

68 J.P.GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 J.P.GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName “John”, lastName “Smith”, gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden “Matienzo”, nameSpouse “Maria Smith”, nameChild “Mike Smith”, nameSibling “Jack Jones”

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
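The idea can be sketched as follows, using the sample record from the slide. The output field names (`relation`, `relatedTo`, `maidenName`), the `RELATION_FIELDS` table and the specific inferences are assumptions made for illustration, not EHRI's actual data model or rules:

```python
# Mirrors the sample record above; the relation names are assumptions of this sketch.
record = {
    "id": "123456",
    "firstName": "John",
    "lastName": "Smith",
    "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}

# Field in the source record -> relation of the mentioned person to the main person.
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def derive_persons(rec: dict) -> list:
    """Create one stub Person per relative mentioned in the record."""
    persons = []
    for field, relation in RELATION_FIELDS.items():
        if field in rec:
            person = {"name": rec[field], "relation": relation, "relatedTo": rec["id"]}
            # Likely (not certain) inferences: the spouse of a male is female,
            # and the maiden name belongs to the spouse.
            if relation == "spouse":
                if rec.get("gender") == "Male":
                    person["gender"] = "Female"
                if "nameSpouseMaiden" in rec:
                    person["maidenName"] = rec["nameSpouseMaiden"]
            persons.append(person)
    return persons


for p in derive_persons(record):
    print(p)
```

Each stub keeps a pointer back to the source record, so a later matching step can compare it against existing Person records and, on a match, link the two networks.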

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard       Cos dist    punishment    Cos dist
guarding    0.593507    punishments   0.668144
sentry      0.512083    punish        0.601212
hlinka      0.496201    punishing     0.543213
gate        0.490032    beatings      0.527033
watching    0.484647    penalty       0.497262
rifle       0.484379    deserved      0.490157
lookout     0.482025    beaten        0.473870
patrol      0.477233    straf         0.473338
soldier     0.475982    offense       0.461230
guarded     0.474689    executing     0.459965
police      0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
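Semantic differencing is vector arithmetic over word embeddings, followed by a nearest-neighbour search by cosine similarity. The sketch below shows the mechanics on toy 2-D vectors invented for this example (they are not trained embeddings and have nothing to do with the INRIA models); the `analogy` helper is likewise a hypothetical name:

```python
import math


def cosine(u: tuple, v: tuple) -> float:
    """Cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def analogy(vectors: dict, a: str, b: str, c: str) -> str:
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = tuple(x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c]))
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))


# Toy 2-D vectors invented for this sketch (axes: royalty, gender); not trained embeddings.
TOY = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "palace": (1.0, 0.0),
}

print(analogy(TOY, "king", "man", "woman"))
```

With real embeddings the vectors have hundreds of dimensions and the result is only as good as the corpus, which is why the slide flags the outcome as "interesting" rather than as a fact.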

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id.
Time and nationality from ULAN.

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki


441 IIIF PRESENTATION API

Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows you to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations on IIIF and JSON-LD.

45 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

453 A TASTE OF FRBROO

EDM–FRBRoo Application Profile Task Force: asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K.Coyle, SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf, rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
    rdf:type bibo:Document, fabio:Manifestation ;
    dc:title "About time" ;
    d:titleURLized "about_time" ;
    fabio:hasSubtitle "Einstein's unfinished revolution" ;
    ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
    foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
    owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
    dc:language lexvo:eng ;
    d:bibliofilID "931138" ;
    dc:format <http://data.deichman.no/format/Book> ;
    d:location_signature "Dav" ;
    dc:publisher d_org:penguin ;
    bibo:numPages "316" ;
    d:physicalDescription "fig." ;
    d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
    fabio:isManifestationOf d_work:x24918900_about_time ;
    d:signatureNote "07x0619gq" ;
    d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
    d:bsID "0181541" ;
    dc:description "Bibliografi: s. 293-294"@no ;
    d:priceInfo "Nkr 170.00" ;
    foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493> ;
    dc:identifier "749919" ;
    d:dewey "115", "530.11" ;
    d:location_dewey "530.11" ;
    bibo:isbn "9780140174618", "0140174613" .

46 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015). Mlist for comments.
Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities.

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M.Hildebrand)



4.5 LIBRARY ONTOLOGIES

War of the Bibliographic Ontologies:

BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
BIBFRAME: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat

4.5.1 RDAREGISTRY

The Resource Description and Access (RDA) registry, RDAregistry.info, is well organized.

4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Prop names are numeric, but lexical (natural language) names are also supported. Serves many semantic formats.

4.5.3 A TASTE OF FRBROO

The EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR Before and After by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations. Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K. Coyle, SWIB 2015

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds its own props.

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, marc2rdf; since 2014) uses Koha open source software, RDF in the core, and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds its own props. Enables a number of agile apps, e.g. searching for related books on a kiosk.

4.5.6.1 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig" ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 17000" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

4.6 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none of them is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Contexts (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015)
Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them. Mlist for comments
Ontology: to follow after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers are available as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)

5.1 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks etc. E.g. Frans Hals on Reasonator.

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Data used for Wikidata Project Sum of All Paintings.

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, then add the info like this.

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo

5.2 VIAF

Virtual International Authority File: 20 national libraries and 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
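To illustrate why coreferencing across VIAF and Wikidata helps name enrichment, here is a minimal sketch that merges the name forms of records linked by coreference pairs. The record ids, the sample names and the function `merge_names` are hypothetical; the study itself does not prescribe this procedure.

```python
from collections import defaultdict

def merge_names(records, coreferences):
    """Group records connected by coreference links and pool their name forms.

    records: dict of record id -> set of name forms.
    coreferences: iterable of (id_a, id_b) pairs; both ids must be in records.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in coreferences:
        parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for rid, names in records.items():
        clusters[find(rid)].update(names)
    return list(clusters.values())

# Hypothetical ids and name forms, for illustration only.
records = {
    "viaf:1": {"Cranach, Lucas", "Lucas Cranach"},
    "wikidata:1": {"Lucas Cranach", "Лукас Кранах"},
    "ulan:1": {"Cranach, Lucas, the elder"},
    "viaf:2": {"Spinoza, Benedictus de"},
}
coreferences = [("viaf:1", "wikidata:1"), ("wikidata:1", "ulan:1")]
clusters = merge_names(records, coreferences)
```

Each cluster then carries both the library-style permutations and the multilingual forms, which is the gain the conclusions point to.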

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments

6.1 MELLON "SPACE" PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket etc.
ConservationSpace: line-of-business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. "Drawings by Rembrandt that are about Mammals".

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach.

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed a mapping of BM data (2M objects) with BM using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI-PMH

Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. a chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016
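As a rough illustration of extracting a gazetteer from a category tree and pruning it, the sketch below walks sub-categories breadth-first from a root, skips pruned branches and stops at a depth limit. The category names, the depth limit and the function `collect_categories` are assumptions for illustration; the actual EFD pipeline is described in the cited paper and talk.

```python
from collections import deque

def collect_categories(subcats, root, pruned, max_depth):
    """Breadth-first walk of a category graph.

    subcats: dict of category -> list of direct sub-categories.
    pruned: categories to cut off (together with everything below them).
    max_depth: categories deeper than this many levels below root are ignored.
    """
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        cat, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for child in subcats.get(cat, []):
            if child in pruned or child in seen:
                continue
            seen.add(child)
            queue.append((child, depth + 1))
    return seen

# Toy category graph (illustrative names, not real Wikipedia data).
subcats = {
    "Food and drink": ["Beverages", "Foods", "Food and drink companies"],
    "Beverages": ["Beer", "Wine"],
    "Food and drink companies": ["Restaurant chains"],
    "Beer": ["Beer by country"],
}
kept = collect_categories(subcats, "Food and drink",
                          pruned={"Food and drink companies"}, max_depth=2)
```

The `seen` set also guards against cycles, which do occur in real category graphs.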

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
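A minimal sketch of jittering, assuming a simple uniform random offset around a shared coordinate. The centroid value, the radius and the function `jitter` are assumptions; the slides do not say which scheme the EFD app uses.

```python
import random

def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset of up to radius_deg."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))

rng = random.Random(42)          # fixed seed so the map is reproducible
center = (42.75, 25.5)           # approximate centroid of Bulgaria (assumption)
points = [jitter(*center, radius_deg=0.5, rng=rng) for _ in range(9000)]
```

A fixed seed keeps each object's flag in the same place between page loads; deriving the offset from the object id would achieve the same without storing state.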

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Helped implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Worked with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014

External Ontologies

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or are owl:SymmetricProperty, i.e. self-inverse).
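To show what the inverse/symmetric pairing buys in practice, here is a small sketch that completes a set of triples with their inverse statements. The property names under `ex:` and the helper `add_inverses` are hypothetical and not taken from the GVP ontology; in the real system a reasoner derives these from the owl:inverseOf and owl:SymmetricProperty axioms.

```python
def add_inverses(triples, inverse_pairs, symmetric):
    """Return triples plus the statements implied by inverse/symmetric props.

    inverse_pairs: iterable of (prop, inverse_prop), as in owl:inverseOf.
    symmetric: set of self-inverse props, as in owl:SymmetricProperty.
    """
    inverse_of = {}
    for a, b in inverse_pairs:
        inverse_of[a] = b
        inverse_of[b] = a

    result = set(triples)
    for s, p, o in triples:
        if p in symmetric:
            result.add((o, p, s))
        elif p in inverse_of:
            result.add((o, inverse_of[p], s))
    return result

# Hypothetical relation names and entities, for illustration only.
triples = {
    ("ex:artistA", "ex:teacherOf", "ex:artistB"),
    ("ex:conceptX", "ex:relatedTo", "ex:conceptY"),
}
closed = add_inverses(
    triples,
    inverse_pairs=[("ex:teacherOf", "ex:studentOf")],
    symmetric={"ex:relatedTo"},
)
```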

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts etc.

6.7.12 AAT IN EUROPEANA

PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
Europeana uses AAT to enrich type/subject/material fields

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work is ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
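The idea above can be sketched as follows: derive stub Person records from the additional names in one record, attaching a few likely (not certain) inferences. The output field names, the inference rules and the function `derive_persons` are assumptions for illustration, not the EHRI implementation.

```python
def derive_persons(rec):
    """Create stub Person records for the relatives mentioned in a record."""
    persons = []
    if "nameSpouse" in rec:
        spouse = {"name": rec["nameSpouse"], "relation": "spouse", "of": rec["id"]}
        # Likely inference (an assumption): the spouse has the other gender.
        if rec.get("gender") == "Male":
            spouse["gender"] = "Female"
        elif rec.get("gender") == "Female":
            spouse["gender"] = "Male"
        if "nameSpouseMaiden" in rec:
            spouse["maidenName"] = rec["nameSpouseMaiden"]
        if "dateMarriage" in rec:
            spouse["dateMarriage"] = rec["dateMarriage"]
        persons.append(spouse)
    for key, relation in (("nameChild", "child"), ("nameSibling", "sibling")):
        if key in rec:
            persons.append({"name": rec[key], "relation": relation, "of": rec["id"]})
    return persons

rec = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
persons = derive_persons(rec)
```

The derived stubs can then be compared against existing Person records, with each inferred field treated as a weak signal rather than a fact.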

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
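A minimal sketch of how matching to a gazetteer also deduplicates: source names that normalize to a name variant of the same gazetteer entry end up under one id. The ids, the tiny gazetteer and the functions `normalize` and `match_places` are hypothetical; the real pipeline is certainly more involved (ambiguity, hierarchy, fuzzy matching).

```python
import unicodedata

def normalize(name):
    """Lowercase, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_marks.lower().split())

def match_places(source_names, gazetteer):
    """Map gazetteer id -> list of source names that matched one of its variants."""
    index = {}
    for geo_id, variants in gazetteer.items():
        for variant in variants:
            index.setdefault(normalize(variant), geo_id)
    matched = {}
    for name in source_names:
        geo_id = index.get(normalize(name))
        if geo_id is not None:
            matched.setdefault(geo_id, []).append(name)
    return matched

# Hypothetical gazetteer ids; name variants are for illustration.
gazetteer = {
    "geo:1": ["Kraków", "Cracow", "Krakau"],
    "geo:2": ["Terezín", "Theresienstadt"],
}
source = ["Krakow", "KRAKAU", "Theresienstadt", "Terezin", "Atlantis"]
matched = match_places(source, gazetteer)
```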

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
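The semantic differencing above is plain vector arithmetic followed by a nearest-neighbour lookup. The sketch below shows the mechanics on made-up 2-dimensional vectors; real word2vec vectors have hundreds of dimensions and are learned from the interview corpus, so the numbers here are assumptions chosen only to make the arithmetic visible.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(vectors, a, b, c):
    """Word closest to vec(a) - vec(b) + vec(c), excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

# Toy vectors (invented): axis 0 separates the two regimes,
# axis 1 separates leaders from organizations.
vectors = {
    "Stalin": [1.0, 1.0],
    "Hitler": [-1.0, 1.0],
    "KGB": [1.0, -1.0],
    "SS": [-1.0, -1.0],
    "guard": [0.1, -0.3],
}
answer = analogy(vectors, "KGB", "Stalin", "Hitler")
```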

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHERS' PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals. 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki

Page 47: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

451 RDAREGISTRY

Resource Description and Access (RDA) Registry info is well organized

452 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats

453 A TASTE OF FRBROO

Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le

TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper categories, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of the LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification.
Help implement R2RML scripts working off Getty's Oracle database; contribution to the Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn).
Work with a wide External Reviewers group (people from OCLC, Europeana, the ISO 25964 working group, etc.).
GraphDB semantic repo, clustered for high availability.
Semantic application development (customized Forest user interface) and tech consulting.
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD.
Help desk/support on Twitter and Google group (see home page).
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.

External Ontologies

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
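The idea of generating relation declarations from a spreadsheet can be sketched as follows. This is a hypothetical Python illustration, not the GVP generator: the rows stand in for cells read from a spreadsheet, and the `ex:` prefix and relation names are invented.

```python
def relation_triples(rows, prefix="ex:"):
    """Emit Turtle lines for relations given as (name, inverse_name) rows."""
    lines = []
    for name, inverse in rows:
        if name == inverse:
            # A self-inverse relation is declared symmetric instead of paired.
            lines.append(f"{prefix}{name} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            lines.append(f"{prefix}{name} a owl:ObjectProperty ; owl:inverseOf {prefix}{inverse} .")
            lines.append(f"{prefix}{inverse} a owl:ObjectProperty ; owl:inverseOf {prefix}{name} .")
    return lines

# Invented rows, as they might appear in two spreadsheet columns.
rows = [("teacher_of", "student_of"), ("related_to", "related_to")]
turtle_lines = relation_triples(rows)
for line in turtle_lines:
    print(line)
```

Keeping the relation list in a spreadsheet lets domain experts review it, while the generated declarations stay consistent in both directions.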

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query, shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. a possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), and DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName “John”, lastName “Smith”, gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden “Matienzo”, nameSpouse “Maria Smith”, nameChild “Mike Smith”, nameSibling “Jack Jones”.
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
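The step from one record to several Person records can be sketched in Python. This is an illustration under assumptions, not EHRI code: the `Person` class, the `MENTIONS` table and the two inference rules (opposite gender for a spouse, a maiden-name variant) are hypothetical, and the inferences are only likely, never certain.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    alt_names: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (relation, other person's name)

# Record fields that mention another person, and the relation they imply.
MENTIONS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}

def expand(record):
    """Create a Person for the record's subject and one for each person it mentions."""
    subject = Person(f'{record["firstName"]} {record["lastName"]}', record.get("gender"))
    people = [subject]
    for key, relation in MENTIONS.items():
        if key not in record:
            continue
        other = Person(record[key])
        if relation == "spouse":
            # Likely inferences, not certainties: opposite gender, maiden-name variant.
            if subject.gender == "Male":
                other.gender = "Female"
            elif subject.gender == "Female":
                other.gender = "Male"
            if "nameSpouseMaiden" in record:
                first_name = other.name.split()[0]
                other.alt_names.append(f'{first_name} {record["nameSpouseMaiden"]}')
        subject.relations.append((relation, other.name))
        people.append(other)
    return people

record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
people = expand(record)
for person in people:
    print(person)
```

The derived records carry extra name variants and relations, which gives the later matching step more evidence to compare against other Person records.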

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard       Cos dist    punishment    Cos dist
guarding    0.593507    punishments   0.668144
sentry      0.512083    punish        0.601212
hlinka      0.496201    punishing     0.543213
gate        0.490032    beatings      0.527033
watching    0.484647    penalty       0.497262
rifle       0.484379    deserved      0.490157
lookout     0.482025    beaten        0.473870
patrol      0.477233    straf         0.473338
soldier     0.475982    offense       0.461230
guarded     0.474689    executing     0.459965
police      0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
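Both the neighbour lists and the "semantic differencing" rest on cosine similarity between word vectors. The Python sketch below shows the mechanics with invented two-dimensional toy vectors; real word2vec vectors have hundreds of dimensions and are learned from the interview text, so the numbers here are assumptions made only so that the arithmetic is easy to follow.

```python
import math

def cosine(u, v):
    """Cosine similarity: 1.0 for the same direction, 0.0 for orthogonal vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(target, vectors, exclude=()):
    """Return the word whose vector is most similar to the target vector."""
    candidates = {w: vec for w, vec in vectors.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

# Invented toy vectors: axis 0 separates the two regimes, axis 1 separates organisation from leader.
toy = {
    "Stalin": [1.0, 0.0],
    "Hitler": [-1.0, 0.0],
    "KGB": [1.0, 1.0],
    "SS": [-1.0, 1.0],
    "guard": [0.1, 0.9],
}

# KGB - Stalin + Hitler, computed component by component.
query = [k - s + h for k, s, h in zip(toy["KGB"], toy["Stalin"], toy["Hitler"])]
answer = nearest(query, toy, exclude={"KGB", "Stalin", "Hitler"})
print(answer)
```

The input words are excluded from the candidates because the query vector usually stays closest to one of them.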

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

Art history networks from Wikipedia, through VIAF id. Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval charters and deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF; imports data as needed. Blog, Wiki.


4.5.2 RDAREGISTRY PROPERTIES

Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.

4.5.3 A TASTE OF FRBROO

The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.

TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).

EDM variant

4.5.3.1 A TASTE OF FRBROO

Simpler FRBRoo variant

4.5.3.2 A TASTE OF FRBROO

More complex FRBRoo variant

4.5.4 FRBR-INSPIRED

FRBR, Before and After by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K. Coyle, SWIB 2015.

4.5.5 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props.

4.5.6 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search for related books on a kiosk.

4.5.6.1 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

4.6 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Contexts (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015). Mlist for comments.
Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)

5.1 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné), e.g. Frans Hals.

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

5.2 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata).
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
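What coreferencing buys for name enrichment can be sketched in a few lines of Python. Everything here is hypothetical: the identifiers `wd:example` and `viaf:example` are placeholders, the name lists are hand-written, and `merged_names` only illustrates the idea of pooling name forms once two records are known to describe the same person.

```python
def merged_names(viaf_names, wikidata_names, coref):
    """Union the name forms of records linked by a Wikidata -> VIAF coreference map."""
    merged = {}
    for wd_id, viaf_id in coref.items():
        forms = set(wikidata_names.get(wd_id, [])) | set(viaf_names.get(viaf_id, []))
        merged[wd_id] = sorted(forms)
    return merged

# Placeholder identifiers and hand-written name forms, for illustration only.
wikidata_names = {"wd:example": ["Lucas Cranach the Elder", "Lucas Cranach der Ältere"]}
viaf_names = {"viaf:example": ["Cranach, Lucas", "Lucas Cranach the Elder"]}
coref = {"wd:example": "viaf:example"}

names = merged_names(viaf_names, wikidata_names, coref)
print(names["wd:example"])
```

The pooled list contains library-style permutations alongside translations, which is the complementarity the conclusions above point to.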

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)

Research functions and sometimes integrated into Virtual Research Environments

6.1 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: drawings by Rembrandt that are about mammals.

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach.

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
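How a Fundamental Relation is inferred from chains of CRM properties can be sketched with a tiny forward-chaining loop. This Python is an illustration under heavy assumptions: it covers only one of the many property paths a real "Thing from Place" relation combines, it is not the GraphDB rule set, and the function name and sample identifiers are invented. The three property names follow CIDOC CRM.

```python
def thing_from_place(triples):
    """Infer (thing, place) pairs: produced at a place, or at a place that falls within it."""
    produced_by = {(s, o) for s, p, o in triples if p == "P108i_was_produced_by"}
    took_place = {(s, o) for s, p, o in triples if p == "P7_took_place_at"}
    within = {(s, o) for s, p, o in triples if p == "P89_falls_within"}

    # Step 1: thing -> production event -> place.
    result = {(thing, place)
              for thing, event in produced_by
              for event2, place in took_place if event == event2}

    # Step 2: climb the place hierarchy until nothing new is inferred.
    changed = True
    while changed:
        new = {(thing, bigger)
               for thing, smaller in result
               for place, bigger in within if place == smaller} - result
        changed = bool(new)
        result |= new
    return result

# Invented sample data.
triples = [
    ("drawing1", "P108i_was_produced_by", "production1"),
    ("production1", "P7_took_place_at", "Amsterdam"),
    ("Amsterdam", "P89_falls_within", "Netherlands"),
    ("Netherlands", "P89_falls_within", "Europe"),
]
pairs = thing_from_place(triples)
print(sorted(pairs))
```

Materialising the inferred pairs ahead of time is what lets a search such as "things from Europe" run as a simple lookup instead of a long property path.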

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but it is monolithic and has imprecisions. Includes the (in)famous diagram.



453 A TASTE OF FRBROO

Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le

TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)

EDM variant

4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
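To illustrate why coreferencing across datasets pays off, here is a minimal sketch (not the method used in the Europeana study) that merges pairwise "same entity" links into clusters with union-find. A Wikidata item linked to GND and to VIAF ends up in one cluster, so the GND record becomes reachable from VIAF too. All identifiers below are hypothetical placeholders.

```python
from collections import defaultdict


def coreference_clusters(links):
    """Group identifiers into clusters using union-find over pairwise links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    # Sort members and clusters so the output is deterministic
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical identifiers, for illustration only
links = [
    ("wikidata:Q_A", "viaf:111"),
    ("wikidata:Q_A", "gnd:222"),
    ("viaf:333", "rkd:444"),
]
clusters = coreference_clusters(links)
```

Each resulting cluster can then be used to propagate name forms from one authority to the others.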

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences: 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy.

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters: prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. "Drawings by Rembrandt that are about Mammals".

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, weaved using the Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

Watch the video (D.Oldman). (Not Ontotext work)

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but is easy with SPARQL.
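A rough sketch of what such a per-year chart query could look like. The modelling is an assumption: `dc:type "newspaper"` and `dc:date` are placeholders, not the confirmed shape of the Europeana data, and the helper name `counts_by_year` is made up for illustration. The function works on bindings in the standard SPARQL JSON results layout.

```python
from collections import Counter

# Assumed modelling: dc:type and dc:date are illustrative placeholders only
QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?year (COUNT(?item) AS ?count)
WHERE {
  ?item dc:type "newspaper" ;
        dc:date ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
}
GROUP BY ?year
ORDER BY ?year
"""


def counts_by_year(bindings):
    """Turn SPARQL JSON result bindings into a {year: count} dict for charting."""
    totals = Counter()
    for row in bindings:
        totals[row["year"]["value"]] += int(row["count"]["value"])
    return dict(sorted(totals.items()))


# Hand-made sample in the shape of results["results"]["bindings"]
sample_bindings = [
    {"year": {"value": "1914"}, "count": {"value": "3"}},
    {"year": {"value": "1913"}, "count": {"value": "2"}},
]
chart_data = counts_by_year(sample_bindings)
```

The aggregation happens on the server; the client only reshapes a handful of rows, which is what makes this easy compared with paging through an item-level API.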

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
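Jittering can be as simple as adding a small random offset to each marker. The sketch below is an illustration only, not the EFD implementation: the function name, the 0.5-degree radius and the approximate centre coordinates are all assumptions.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=None):
    """Return (lat, lon) moved by a uniform random offset of at most radius_deg."""
    if rng is None:
        rng = random.Random()
    return (
        lat + rng.uniform(-radius_deg, radius_deg),
        lon + rng.uniform(-radius_deg, radius_deg),
    )


rng = random.Random(42)   # fixed seed so the layout is reproducible
center = (42.75, 25.5)    # approximate centre of Bulgaria (assumption)
points = [jitter(*center, rng=rng) for _ in range(5)]
```

A fixed seed keeps markers in the same place between page loads, which avoids flags jumping around when the map is redrawn.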

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (Diagram by J.Cobb, 2014).
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust.
AAT 2014-02, TGN 2014-08, ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014.

External Ontologies:

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
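What inverse pairs and symmetric properties buy you can be sketched in a few lines: for every stated relation, the reverse statement follows. This is an illustration under assumptions, not the GVP implementation (there the inference is done in the repository); the property names `ex:teacherOf`, `ex:studentOf` and `ex:relatedTo` are hypothetical.

```python
def add_inverses(triples, inverse_of, symmetric):
    """Materialize the reverse statement for inverse and symmetric properties."""
    # owl:inverseOf holds in both directions: if p inverseOf q then q inverseOf p
    inv = dict(inverse_of)
    inv.update({q: p for p, q in inverse_of.items()})

    result = set(triples)
    for s, p, o in triples:
        if p in symmetric:
            result.add((o, p, s))        # self-inverse relation
        elif p in inv:
            result.add((o, inv[p], s))   # paired inverse relation
    return result


# Hypothetical property names, for illustration only
inverse_of = {"ex:teacherOf": "ex:studentOf"}
symmetric = {"ex:relatedTo"}
triples = [
    ("a", "ex:teacherOf", "b"),
    ("c", "ex:relatedTo", "d"),
]
expanded = add_inverses(triples, inverse_of, symmetric)
```

Declaring each relation once with its inverse means a query can start from either end of the relation.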

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in "Getty Vocabs: Why LOD? Why Now?" (J.Cobb, 2014). E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

68 JPGETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName “John”, lastName “Smith”, gender Male, dateMarriage 1921-01-05
additional names: nameSpouseMaiden “Matienzo”, nameSpouse “Maria Smith”, nameChild “Mike Smith”, nameSibling “Jack Jones”

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
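The idea can be sketched as follows. This is a simplified illustration, not EHRI code: the `Person` class, the role names and the single inference shown (the spouse's likely maiden-name form) are assumptions chosen to match the sample record.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    alt_names: List[str] = field(default_factory=list)
    relations: List[Tuple[str, str]] = field(default_factory=list)  # (role, other name)


# Field name -> (role of the other person, role of the main person seen from the other)
ROLES = {
    "nameSpouse": ("spouse", "spouse"),
    "nameChild": ("child", "parent"),
    "nameSibling": ("sibling", "sibling"),
}


def persons_from_record(rec):
    """Create Person records for everyone mentioned in one source record."""
    main = Person(f'{rec["firstName"]} {rec["lastName"]}', rec.get("gender"))
    people = [main]
    for key, (role, inverse_role) in ROLES.items():
        if key not in rec:
            continue
        other = Person(rec[key])
        if role == "spouse" and "nameSpouseMaiden" in rec:
            # Likely inference: given name(s) plus maiden surname is an alternate name
            given = rec[key].rsplit(" ", 1)[0]
            other.alt_names.append(f'{given} {rec["nameSpouseMaiden"]}')
        main.relations.append((role, other.name))
        other.relations.append((inverse_role, main.name))
        people.append(other)
    return people


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
people = persons_from_record(record)
```

The extra name forms and relations give the matcher more evidence than the bare name of each mentioned person.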

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist   punishment   Cos dist
guarding   0.593507   punishments  0.668144
sentry     0.512083   punish       0.601212
hlinka     0.496201   punishing    0.543213
gate       0.490032   beatings     0.527033
watching   0.484647   penalty      0.497262
rifle      0.484379   deserved     0.490157
lookout    0.482025   beaten       0.473870
patrol     0.477233   straf        0.473338
soldier    0.475982   offense      0.461230
guarded    0.474689   executing    0.459965
police     0.474291   merciless    0.455123

semantic differencing (interesting): KGB - Stalin + Hitler = SS
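The vector arithmetic behind such "semantic differencing" can be sketched with toy data. The three-dimensional vectors below are invented for illustration (real word2vec vectors are learned from the interview text and have hundreds of dimensions), and the function names are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Invented toy vectors: (organisation-ness, Soviet context, German context)
vectors = {
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "KGB":    [1.0, 1.0, 0.0],
    "SS":     [1.0, 0.0, 1.0],
    "guard":  [0.5, 0.2, 0.2],
}
answer = analogy(vectors, "KGB", "Stalin", "Hitler")
```

The nearest-neighbour lists in the table come from the same similarity measure, applied to a single word instead of a combined vector.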

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id.
Time and nationality from ULAN.

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki


4531 A TASTE OF FRBROO

Simpler FRBRoo variant

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K.Coyle, SWIB 2015.

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi s 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO Place enrichment Person name recognitionINRIA word2vec experiments

guard Cos dist punishment Cos dist

guard Cos dist punishment Cos dist

guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123

semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper

Art History networks from Wikipedia through VIAF idTime and nationality from ULAN

612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT

613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD

Ethan Gruber

Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).

6134 NOMISMA

Shared authorities for numismatics, e.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius, … rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

Page 51: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

4532 A TASTE OF FRBROO

More complex FRBRoo variant

454 FRBR-INSPIRED

FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K. Coyle, SWIB 2015.

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.

4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .

46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none of them is very good.

E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015). Mlist for comments.
Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected:
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)

51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné), e.g. Frans Hals.

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

516 HISTROPEDIA

Timelines of everything, e.g. paintings by Leonardo.

52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata).
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
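To illustrate why coreferencing across datasets pays off, here is a small sketch that merges pairwise "same entity" links into clusters with union-find and then pools the name forms of each cluster. The identifiers are placeholders (not real VIAF, GND, ULAN or Wikidata IDs) and the data layout is an assumption made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def coreference_clusters(links: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Merge pairwise coreference links into clusters (union-find)."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for ident in list(parent):
        clusters[find(ident)].add(ident)
    return sorted(sorted(c) for c in clusters.values())


def merged_names(cluster: List[str], names: Dict[str, List[str]]) -> List[str]:
    """Pool all name forms known for any identifier in the cluster."""
    return sorted({n for ident in cluster for n in names.get(ident, [])})


# Placeholder identifiers, for illustration only.
LINKS = [
    ("wikidata:Q-spinoza", "viaf:spinoza"),
    ("viaf:spinoza", "gnd:spinoza"),
    ("wikidata:Q-cranach", "ulan:cranach"),
]
NAMES = {
    "viaf:spinoza": ["Spinoza, Benedictus de", "Spinoza, Baruch"],
    "wikidata:Q-spinoza": ["Baruch Spinoza", "Benedictus de Spinoza"],
}
CLUSTERS = coreference_clusters(LINKS)
SPINOZA_NAMES = merged_names(CLUSTERS[0], NAMES)
```

The pooled list combines library-style inverted forms with natural-order and translated forms, which is the gain the conclusions above point to.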

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy, etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters; prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists

62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. drawings by Rembrandt that are about mammals.

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

651 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V., LNCS 9398, p. 182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them.
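A minimal jittering sketch, assuming each object only carries the country centroid. The centroid coordinates, the 50 km radius and the flat-earth conversion of 111.32 km per degree are all illustrative assumptions, not the values used in the EFD app.

```python
import math
import random
from typing import Tuple

KM_PER_DEGREE = 111.32          # rough length of one degree of latitude
CENTER_BG = (42.7, 25.5)        # approximate centroid of Bulgaria (assumed)


def jitter(lat: float, lon: float, radius_km: float = 50.0,
           rng=random) -> Tuple[float, float]:
    """Move a point to a random location within radius_km of itself."""
    # sqrt() makes points uniform over the disc instead of piling up centrally
    distance = radius_km * math.sqrt(rng.random())
    angle = 2 * math.pi * rng.random()
    dlat = distance * math.cos(angle) / KM_PER_DEGREE
    dlon = distance * math.sin(angle) / (KM_PER_DEGREE * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


RNG = random.Random(42)         # fixed seed so the map looks the same each time
POINTS = [jitter(CENTER_BG[0], CENTER_BG[1], 50.0, RNG) for _ in range(1000)]
```

A fixed seed (or a seed derived from the object id) keeps each flag in the same place between page loads, which avoids markers that appear to jump around.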

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT 2014-02, TGN 2014-08, ULAN 2015-03.

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A., International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014. External Ontologies:

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty: self-inverse).
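What inverse and symmetric declarations buy can be shown with a few lines that materialize the implied triples. This is only a sketch of the semantics: the `ex:` property names are hypothetical, and in the GVP setup such inference would be done by the triple store rather than by application code.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def add_inverses(triples: Iterable[Triple],
                 inverse_of: Dict[str, str],
                 symmetric: Set[str]) -> Set[Triple]:
    """Return the triples plus those implied by inverse/symmetric props."""
    # owl:inverseOf works in both directions, so index it both ways.
    inverses = dict(inverse_of)
    inverses.update({v: k for k, v in inverse_of.items()})

    result = set(triples)
    for s, p, o in list(result):
        if p in symmetric:
            result.add((o, p, s))            # self-inverse
        elif p in inverses:
            result.add((o, inverses[p], s))  # swap subject/object, flip prop
    return result


# Hypothetical properties and resources, for illustration only.
INVERSE_OF = {"ex:teacherOf": "ex:studentOf"}
SYMMETRIC = {"ex:relatedTo"}
ASSERTED = [
    ("ex:A", "ex:teacherOf", "ex:B"),
    ("ex:B", "ex:relatedTo", "ex:C"),
]
INFERRED = add_inverses(ASSERTED, INVERSE_OF, SYMMETRIC)
```

Declaring the pairs once in the ontology means data only needs to state each relation in one direction.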

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

68 JP GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.


Page 52: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

454 FRBR-INSPIRED

FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015

455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (eg <term/kind/Abbreviation>)
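The three mapping targets above can be sketched as a tiny generator that turns spreadsheet rows into Turtle statements. This is only an illustration of the idea, not the GVP tooling: the rows, the `ex:` prefix, the class `ex:KeyValue` and the function `row_to_turtle` are all hypothetical.

```python
# Hypothetical spreadsheet rows: (key, code, mapping kind, local name).
# None of these rows is taken from the actual GVP spreadsheets.
ROWS = [
    ("record_type", "C", "class", "Concept"),
    ("historic_flag", "H", "property", "historicFlag"),
    ("term_kind", "A", "value", "term/kind/Abbreviation"),
]


def row_to_turtle(key, code, kind, name):
    """Emit one Turtle statement for a key value, depending on its mapping kind."""
    label = f'rdfs:label "{key}={code}"'
    if kind == "class":      # key value becomes a custom sub-class
        return f"ex:{name} a owl:Class ; {label} ."
    if kind == "property":   # key value becomes a custom (sub-)property
        return f"ex:{name} a owl:ObjectProperty ; {label} ."
    if kind == "value":      # key value becomes an ontology value (an individual)
        return f"<{name}> a ex:KeyValue ; {label} ."
    raise ValueError(f"unknown mapping kind: {kind}")


if __name__ == "__main__":
    for row in ROWS:
        print(row_to_turtle(*row))
```

Keeping the mapping in a table means domain experts can review and extend it without touching the generator.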

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
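A minimal sketch of what the inverse/symmetric declarations buy you: given a table of relation pairs, we can emit the OWL declarations and also materialize the reverse direction of every asserted relation. The relation names (`ex:teacherOf`, `ex:studentOf`, `ex:relatedTo`) and both functions are hypothetical, chosen only for illustration.

```python
# Hypothetical relation table: each key is the inverse of its value.
INVERSE_PAIRS = {"ex:teacherOf": "ex:studentOf"}
# Hypothetical self-inverse relations.
SYMMETRIC = {"ex:relatedTo"}


def declarations():
    """Turtle declarations for all inverse pairs and symmetric properties."""
    lines = []
    for p, q in sorted(INVERSE_PAIRS.items()):
        lines.append(f"{p} a owl:ObjectProperty ; owl:inverseOf {q} .")
        lines.append(f"{q} a owl:ObjectProperty ; owl:inverseOf {p} .")
    for p in sorted(SYMMETRIC):
        lines.append(f"{p} a owl:SymmetricProperty .")
    return lines


def with_inverses(triples):
    """Return the triples plus the reverse statement for every known relation."""
    inverse = dict(INVERSE_PAIRS)
    inverse.update({q: p for p, q in INVERSE_PAIRS.items()})
    inverse.update({p: p for p in SYMMETRIC})  # symmetric = its own inverse
    out = set(triples)
    for s, p, o in triples:
        if p in inverse:
            out.add((o, inverse[p], s))
    return out
```

In a real repository the reasoner does this materialization from the owl:inverseOf and owl:SymmetricProperty axioms; the function above just shows the effect.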

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search.
Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). Eg:

AAT used in Cataloging Calculator: finds bibliographic and authority data (language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.)

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

68 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) Cast after

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
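The idea above can be sketched in a few lines. This is not the EHRI implementation: the dict layout, the functions `derive_persons` and `match`, and the single "likely inference" shown (a spouse's alternative name built from the maiden name) are assumptions made for illustration, and the matching is plain normalized-name equality.

```python
# The sample record from the slide, as a plain dict (field layout is an assumption).
RECORD = {
    "id": "123456",
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}


def derive_persons(rec):
    """Create one Person dict per person mentioned in a source record."""
    subject = f"{rec['firstName']} {rec['lastName']}"
    persons = [{"name": subject, "role": "subject", "source": rec["id"],
                "gender": rec.get("gender"), "altNames": []}]
    for field, role in (("nameSpouse", "spouse"), ("nameChild", "child"),
                        ("nameSibling", "sibling")):
        if field in rec:
            persons.append({"name": rec[field], "role": role,
                            "source": rec["id"], "relatedTo": subject,
                            "altNames": []})
    # Likely inference: the spouse was also known as <given name> <maiden name>.
    maiden = rec.get("nameSpouseMaiden")
    for person in persons:
        if person["role"] == "spouse" and maiden:
            given = person["name"].split()[0]
            person["altNames"].append(f"{given} {maiden}")
    return persons


def normalize(name):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def match(person, database):
    """Return database records whose name equals any known name of the person."""
    names = {normalize(n) for n in [person["name"], *person["altNames"]]}
    return [other for other in database if normalize(other["name"]) in names]
```

A real pipeline would need fuzzy matching and more evidence (dates, places) before asserting that two records are the same person.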

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
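To show why matching to a gazetteer also deduplicates, here is a minimal sketch. It is not the EHRI pipeline: the gazetteer entries, the placeholder ids (`geo:0001`, `geo:0002`, which are not real Geonames ids) and the functions `normalize` and `match_places` are hypothetical. Name variants that resolve to the same id end up in one group.

```python
import unicodedata

# Hypothetical mini-gazetteer: normalized name -> placeholder id (not real Geonames ids).
GAZETTEER = {
    "krakow": "geo:0001", "cracow": "geo:0001",
    "vienna": "geo:0002", "wien": "geo:0002",
}


def normalize(name):
    """Strip diacritics, lowercase and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


def match_places(names):
    """Group source place names by gazetteer id; collect names with no match."""
    groups, unmatched = {}, []
    for name in names:
        place_id = GAZETTEER.get(normalize(name))
        if place_id is None:
            unmatched.append(name)
        else:
            groups.setdefault(place_id, []).append(name)
    return groups, unmatched
```

For example, `match_places(["Kraków", "Cracow", "Vienna", "Atlantis"])` puts the first two names in one group and leaves "Atlantis" unmatched for manual review.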

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
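Both the nearest-word lists and the "semantic differencing" come down to cosine similarity between word vectors (the word2vec distance tool labels the score "Cosine distance", but higher means closer). A minimal sketch follows; the three-dimensional vectors are made up so that the arithmetic works out and are not taken from the EHRI model, and the functions `cosine`, `nearest` and `analogy` are illustrative names.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest(target, vectors, exclude=()):
    """Word whose vector is most similar to the target vector."""
    candidates = (w for w in vectors if w not in exclude)
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


def analogy(a, b, c, vectors):
    """Solve a - b + c = ? by vector arithmetic, ignoring the input words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(target, vectors, exclude={a, b, c})


# Toy vectors, invented for illustration only.
# Dimensions: (organization-ness, Soviet context, German context).
TOY = {
    "KGB": [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS": [1.0, 0.0, 1.0],
    "guard": [0.9, 0.3, 0.3],
}

if __name__ == "__main__":
    print(analogy("KGB", "Stalin", "Hitler", TOY))
```

Real models have hundreds of dimensions learned from the corpus, so the analogy holds only approximately and the result is the closest word rather than an exact hit.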

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation (Medallic Art of the Great War)

6134 NOMISMA

Shared authorities for numismatics. Eg a mint

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki


455 BRITISH LIBRARY DATA MODEL

Pragmatic data model that reuses several ontologies and adds own props

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on Kiosk

4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

46 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM

Progress report (2015), Conceptual Model 1.0 (Sep 2016), Mlist for comments. Document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD, some are interconnected:
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). Eg Frans Hals

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2)

515 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info, like this

516 HISTROPEDIA

Timelines of everything. Eg paintings by Leonardo

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy, etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, eg:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search, eg Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, weaved using Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman)

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

64 CONSERVATIONSPACE

Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but is easy with SPARQL

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (GitHub). Uses GraphDB and ElasticSearch enterprise connector

661 TASTY BULGARIAN RECIPES

Eg 150 with beer, including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
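Jittering is easy to sketch: add a small random offset to each marker so that objects sharing one country-level coordinate spread out instead of stacking. The slides do not say how the offsets were computed, so the uniform offset, the 0.5 degree radius, the approximate centre coordinates and the function `jitter` are all assumptions for illustration.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset of at most radius_deg."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))


if __name__ == "__main__":
    rng = random.Random(42)        # seeded, so the map looks the same on every run
    center = (42.7, 25.5)          # rough centre of Bulgaria (approximate, assumed)
    flags = [jitter(*center, rng=rng) for _ in range(5)]
    for lat, lon in flags:
        print(f"{lat:.4f}, {lon:.4f}")
```

A seeded generator keeps marker positions stable between page loads; the radius should stay small enough that markers remain inside the country.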

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials


Page 54: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

456 FIRST LIBRARY THAT RUNS ON RDF

Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk

httpdatadeichmannomarc2rdf

4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613

46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract an FD Gazetteer.

Tagarev A., Tolosi L., and Alexiev V. Domain-specific modeling: Towards a Food and Drink Gazetteer. LNCS 9398, p. 182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V. Using DBpedia in Europeana Food and Drink. DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is category level), available content, and NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames
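A minimal sketch of how a hierarchical facet can be counted: each object counts toward its own place and every ancestor. The tiny parent map is a hypothetical stand-in for Geonames parent links, and the function names are made up for illustration.

```python
# Hypothetical miniature place hierarchy (child -> parent); real data would
# come from Geonames parent links.
PARENT = {"Sofia": "Bulgaria", "Plovdiv": "Bulgaria", "Bulgaria": "Europe", "Europe": None}

def facet_path(place, parent=PARENT):
    """Return the path from the top of the hierarchy down to the place."""
    path = []
    while place is not None:
        path.append(place)
        place = parent.get(place)
    return list(reversed(path))

def facet_counts(object_places, parent=PARENT):
    """Count each object under its place and under all ancestors of that place."""
    counts = {}
    for place in object_places:
        for ancestor in facet_path(place, parent):
            counts[ancestor] = counts.get(ancestor, 0) + 1
    return counts

print(facet_path("Sofia"))
print(facet_counts(["Sofia", "Plovdiv", "Bulgaria"]))
```

An object tagged only "Bulgaria" still shows up under "Europe", which is the point of adding the hierarchy.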

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
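To illustrate the idea of marker clustering, here is a simple grid-based sketch. It is an assumption-level illustration, not the algorithm of the Cluster Mapper library: points falling in the same grid cell are replaced by one marker at their centroid, with a count.

```python
from collections import defaultdict

def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points by grid cell; return (centroid_lat, centroid_lon, size)."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (int(lat // cell_deg), int(lon // cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append((sum(p[0] for p in members) / n,
                         sum(p[1] for p in members) / n,
                         n))
    return clusters

# Made-up coordinates: two points share a cell, one is far away.
points = [(42.1, 23.2), (42.9, 23.8), (48.5, 2.3)]
for cluster in grid_clusters(points):
    print(cluster)
```

A larger cell_deg gives fewer, bigger clusters, which suits low zoom levels.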

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them.
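A minimal jittering sketch: each flag is moved by a small random offset around the shared coordinate. The radius, the seed and the approximate center used below are assumptions for illustration, not the values used in the EFD application.

```python
import random

def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the point moved by a small random offset in each coordinate."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))

rng = random.Random(42)   # seeded so the layout is repeatable between page loads
center = (42.7, 25.5)     # assumed approximate center of Bulgaria
flags = [jitter(*center, rng=rng) for _ in range(5)]
for flag in flags:
    print(flag)
```

Seeding keeps each flag in the same spot on every render; an unseeded generator would make the flags jump around.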

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper categories, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD

GVP is well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of the LODLAM cloud (diagram by J. Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

Alexiev V., Lindenthal J., and Isaac A. On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). International Journal on Digital Libraries, August 2015, Springer.

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See "GVP LOD: Ontologies and Semantic Representation", V. Alexiev, CIDOC 2014.

External Ontologies

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
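A minimal sketch of generating these axioms from spreadsheet-like rows. The rows and the ex: property names are hypothetical, not actual GVP relation codes, and this is not the generator used for GVP: it only shows how an inverse pair yields one owl:inverseOf axiom and a self-inverse row yields owl:SymmetricProperty.

```python
# Hypothetical rows as they might appear in a relations spreadsheet:
# (property, inverse property). A symmetric relation lists itself as its inverse.
RELATIONS = [
    ("ex:teacherOf", "ex:studentOf"),
    ("ex:studentOf", "ex:teacherOf"),
    ("ex:distinguishedFrom", "ex:distinguishedFrom"),
]

def ontology_axioms(relations):
    """Emit one Turtle axiom per relation pair, stating each pair only once."""
    seen, lines = set(), []
    for prop, inverse in relations:
        if prop == inverse:
            lines.append(f"{prop} a owl:SymmetricProperty .")
        elif (inverse, prop) not in seen:
            lines.append(f"{prop} owl:inverseOf {inverse} .")
        seen.add((prop, inverse))
    return lines

for line in ontology_axioms(RELATIONS):
    print(line)
```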

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V., Cobb J., Garcia G., Harpring P. Getty Vocabularies Linked Open Data: Semantic Representation. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search.
Many are described in "Getty Vocabs: Why LOD? Why Now?" (J. Cobb, 2014). E.g.:

AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.

68 JPGETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work is ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. a possible mapping of (sculpture) "Cast after".

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match them to other Person records in the database.
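The idea can be sketched as follows. Field names follow the sample record above; the output structure, the relation labels and the single inference (the maiden name belongs to the spouse) are assumptions for illustration, not the EHRI implementation.

```python
# Relation labels are assumptions; field names come from the sample record.
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}

def derive_persons(record):
    """Create stub Person records for the people mentioned in one source record."""
    main_id = f"rec{record['id']}"
    persons = []
    for field, relation in RELATION_FIELDS.items():
        full_name = record.get(field)
        if not full_name:
            continue
        # Split on the last space: everything before it is treated as first name(s).
        first, _, last = full_name.rpartition(" ")
        person = {"firstName": first, "lastName": last,
                  "relation": relation, "relatedTo": main_id}
        # Likely inference: the maiden name on the record belongs to the spouse.
        if relation == "spouse" and record.get("nameSpouseMaiden"):
            person["maidenName"] = record["nameSpouseMaiden"]
        persons.append(person)
    return persons

record = {
    "id": 123456, "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
for person in derive_persons(record):
    print(person)
```

The derived stubs can then go through the same matching step as any other Person record.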

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline for free text was also developed.
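A minimal sketch of name-based matching that also deduplicates: spelling variants are normalized, and source names that resolve to the same gazetteer entry are grouped together. The gazetteer content and its id are made up, and the real pipeline is certainly more involved (this sketch ignores place type, country and ambiguity).

```python
import unicodedata

def normalize(name):
    """Lowercase, strip accents and punctuation so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c for c in no_accents.lower() if c.isalnum() or c == " ").strip()

def match_places(source_names, gazetteer):
    """Map source names to gazetteer ids; names sharing an id are duplicates."""
    index = {normalize(label): gid
             for gid, labels in gazetteer.items() for label in labels}
    matches = {}
    for name in source_names:
        gid = index.get(normalize(name))
        if gid is not None:
            matches.setdefault(gid, []).append(name)
    return matches

# Made-up gazetteer entry; "g-1" is not a real Geonames id.
gazetteer = {"g-1": ["Kraków", "Cracow"]}
print(match_places(["Krakow", "KRAKÓW", "Cracow", "Unknown place"], gazetteer))
```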

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
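How such nearest-word lists and "semantic differencing" results are computed can be sketched with plain cosine similarity over word vectors. The three-dimensional toy vectors below are made up for illustration; real word2vec vectors have hundreds of dimensions and come from a trained model.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(vectors, a, b, c):
    """Return the word closest to a - b + c, excluding the three input words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Made-up toy vectors; dimensions loosely read as (royalty, male, female).
vectors = {
    "king": (1.0, 1.0, 0.0),
    "queen": (1.0, 0.0, 1.0),
    "man": (0.0, 1.0, 0.0),
    "woman": (0.0, 0.0, 1.0),
    "apple": (0.1, 0.1, 0.1),
}
print(analogy(vectors, "king", "man", "woman"))
```

The slide's example follows the same pattern: subtract one vector, add another, and look up the nearest remaining word.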

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing them to Geonames, so we can get coordinates.

611 OTHER PROJECTS WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art history networks from Wikipedia, through VIAF id
Time and nationality from ULAN

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki


4561 OSLO PUBLIC LIBRARY DATA

d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time,
    d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".

46 ARCHIVAL ONTOLOGIES

3 attempts to represent EAD as RDF, but IMHO none is very good.

E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.

Records in Context (RiC): new upcoming semantic standard by ICA.

Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.

Progress report (2015)
Conceptual Model 1.0 (Sep 2016), Mlist for comments. Documents key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)

51 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals.

513 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
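A sketch of how such a "missing inventory number" list could be queried on the Wikidata Query Service. It is an assumption that the actual report works this way; the function name is made up. The Wikidata identifiers used are P31 (instance of), Q3305213 (painting), P195 (collection) and P217 (inventory number), and the wd:, wdt:, wikibase: and bd: prefixes are predefined by the query service.

```python
def missing_inventory_query(collection_qid, limit=100):
    """Build SPARQL for paintings in a collection that lack an inventory number."""
    return f"""
SELECT ?painting ?paintingLabel WHERE {{
  ?painting wdt:P31 wd:Q3305213 ;             # instance of: painting
            wdt:P195 wd:{collection_qid} .    # collection
  FILTER NOT EXISTS {{ ?painting wdt:P217 ?inv }}   # no inventory number
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" }}
}}
LIMIT {limit}
""".strip()

# "Q731126" is assumed here to be the item of the J. Paul Getty Museum;
# verify the QID on Wikidata before relying on it.
print(missing_inventory_query("Q731126", limit=5))
```

The printed query can be pasted into the Wikidata Query Service; the function itself makes no network calls.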

515 LETS FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs; Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchiveSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists



46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good

Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies

Records in Context (RiC) new upcoming semantic standard by ICA

Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM

(2015) 10 (Sep 2016) Document key components of archival description

properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities

Progress report Mlist for commentsConceptual Model

461 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

6.1 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. drawings by Rembrandt that are about mammals.

6.2.2 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach.

6.2.3 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

6.2.5 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI-PMH

Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.
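A rough sketch of what such an aggregation could look like. The query text is an assumption for illustration: the properties used (dc:type, dcterms:issued) and the literal "newspaper" are hypothetical modeling choices, not the actual Europeana data shape, and the sample result counts are invented. Only the response layout follows the standard SPARQL JSON results format.

```python
# Hypothetical query: property choices are assumptions, not the Europeana model
NEWSPAPERS_BY_YEAR = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?year (COUNT(?item) AS ?n) WHERE {
  ?item dc:type "newspaper" ;
        dcterms:issued ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
}
GROUP BY ?year
ORDER BY ?year
"""

def counts_by_year(sparql_json):
    """Turn a SPARQL JSON result set with ?year and ?n bindings into {year: count}."""
    counts = {}
    for row in sparql_json["results"]["bindings"]:
        counts[row["year"]["value"]] = int(row["n"]["value"])
    return counts

# Invented sample response in the standard SPARQL JSON results layout
sample = {
    "head": {"vars": ["year", "n"]},
    "results": {"bindings": [
        {"year": {"type": "literal", "value": "1914"},
         "n": {"type": "literal", "value": "1200"}},
        {"year": {"type": "literal", "value": "1915"},
         "n": {"type": "literal", "value": "950"}},
    ]},
}

for year, n in counts_by_year(sample).items():
    print(year, n)
```

The grouping happens on the server, so the client only receives one row per year, which is what makes charting millions of items practical.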

6.6 EUROPEANA FOOD AND DRINK

Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.

EFD Semantic App

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

6.6.3 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.
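One possible way to picture category-tree pruning is a breadth-first walk that records the level of each category and skips branches marked as off-topic. This is only a sketch under stated assumptions: the category names, the pruned set and the depth limit are invented for illustration and do not reproduce the project's actual procedure.

```python
from collections import deque

def collect_categories(root, children, pruned=frozenset(), max_depth=3):
    """Breadth-first walk of a category graph; returns {category: level}.
    Pruned categories (and everything reachable only through them) are skipped."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        cat = queue.popleft()
        depth = levels[cat]
        if depth == max_depth:
            continue  # do not descend below the depth limit
        for child in children.get(cat, ()):
            if child in pruned or child in levels:
                continue  # skip off-topic branches and already visited nodes (cycles)
            levels[child] = depth + 1
            queue.append(child)
    return levels

# Toy category graph (hypothetical names, includes a cycle back to the root)
toy_children = {
    "Food and drink": ["Beverages", "Cuisine by country", "Food and drink companies"],
    "Beverages": ["Beer", "Wine"],
    "Beer": ["Food and drink"],
    "Food and drink companies": ["Company founders"],
}

gazetteer = collect_categories(
    "Food and drink", toy_children, pruned={"Food and drink companies"}
)
print(sorted(gazetteer.items(), key=lambda kv: kv[1]))
```

Keeping the level alongside each category makes it easy to inspect how relevance drops off as the walk gets deeper.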

6.6.5 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.

6.6.8 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
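Jittering can be sketched as adding a small random offset to each marker. The snippet below is a minimal illustration, assuming a uniform spread within a fixed radius; the radius, the approximate center coordinates and the flat-earth conversion (about 111 km per degree of latitude) are assumptions of this sketch, not details of the EFD application.

```python
import math
import random

def jitter(lat, lon, radius_km=50.0, rng=random):
    """Return a point displaced uniformly at random within radius_km of (lat, lon)."""
    # sqrt spreads points evenly over the disc instead of bunching them at the center
    distance = radius_km * math.sqrt(rng.random())
    bearing = 2 * math.pi * rng.random()
    dlat = distance * math.cos(bearing) / 111.0
    dlon = distance * math.sin(bearing) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

# Approximate center of Bulgaria (assumed for illustration)
CENTER = (42.7, 25.5)

rng = random.Random(0)  # seeded so repeated runs place the flags identically
flags = [jitter(CENTER[0], CENTER[1], rng=rng) for _ in range(5)]
for lat, lon in flags:
    print(round(lat, 4), round(lon, 4))
```

Seeding the generator, or deriving the seed from the object identifier, keeps each flag in the same spot between page loads.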

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT 2014-02, TGN 2014-08, ULAN 2015-03.

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint
Comprehensive documentation (100 pages)
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

Links:
http://vocab.getty.edu/ontology
http://vocab.getty.edu/sparql
http://vocab.getty.edu/doc

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation (V. Alexiev, CIDOC 2014).

External Ontologies

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:

Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
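The effect of inverse and symmetric declarations can be sketched as a small expansion step over triples. This is an illustration only: the relation names below are hypothetical placeholders, not actual GVP associative relations, and a real triple store would derive these statements through its own inference rules.

```python
def expand_relations(triples, inverse_of):
    """Add the inverse statement for every asserted associative relation.
    A symmetric relation is modeled as a property that is its own inverse."""
    # Make the lookup bidirectional: if p has inverse q, then q has inverse p
    lookup = dict(inverse_of)
    for prop, inv in inverse_of.items():
        lookup.setdefault(inv, prop)
    result = set(triples)
    for subj, prop, obj in triples:
        if prop in lookup:
            result.add((obj, lookup[prop], subj))
    return result

# Hypothetical relation names, for illustration only
INVERSES = {
    "ex:teacherOf": "ex:studentOf",  # an owl:inverseOf pair
    "ex:relatedTo": "ex:relatedTo",  # owl:SymmetricProperty (self-inverse)
}

asserted = {
    ("ex:A", "ex:teacherOf", "ex:B"),
    ("ex:C", "ex:relatedTo", "ex:D"),
}

for triple in sorted(expand_relations(asserted, INVERSES)):
    print(triple)
```

Declaring the pairs once in a table means only one direction has to be stored per relation instance; the other direction follows mechanically.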

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014).

E.g. AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015), it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
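A minimal sketch of the first two steps (creating Person records and making likely inferences), using the sample record above. The output field names (relatedTo, relation, inferred, maidenName) and the inference rules are assumptions for illustration, not the EHRI data model; inferred values are flagged because they are only likely, not certain.

```python
# Which "additional name" fields yield a new Person, and how they relate to the source
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}
OPPOSITE_GENDER = {"Male": "Female", "Female": "Male"}

def split_name(full_name):
    """Split on the last space: 'Maria Smith' -> ('Maria', 'Smith')."""
    first, _, last = full_name.rpartition(" ")
    return first, last

def derive_persons(rec):
    """Create one Person stub per relative named in a source record."""
    persons = []
    for field, relation in RELATION_FIELDS.items():
        full_name = rec.get(field)
        if not full_name:
            continue
        first, last = split_name(full_name)
        person = {"firstName": first, "lastName": last,
                  "relatedTo": rec["id"], "relation": relation, "inferred": []}
        if relation == "spouse":
            person["dateMarriage"] = rec.get("dateMarriage")
            if rec.get("nameSpouseMaiden"):
                person["maidenName"] = rec["nameSpouseMaiden"]
            if rec.get("gender") in OPPOSITE_GENDER:
                # Likely inference only, so it is recorded as such
                person["gender"] = OPPOSITE_GENDER[rec["gender"]]
                person["inferred"].append("gender")
        persons.append(person)
    return persons

record = {"id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
          "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
          "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
          "nameSibling": "Jack Jones"}

for person in derive_persons(record):
    print(person)
```

The derived stubs can then be compared against existing Person records, for example on name, maiden name and marriage date, to propose matches for review.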

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
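The two operations behind the table and the "semantic differencing" line can be sketched in a few lines of Python. The vectors below are tiny hand-made toys, not word2vec output, and the words are deliberately generic; a trained model would supply vectors of a few hundred dimensions. The scores in the table rise with closeness, so they are read here as cosine similarities, which is an assumption.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest(target, vectors, exclude=()):
    """Rank vocabulary words by similarity to a target vector, best first."""
    scored = {w: cosine(target, vec) for w, vec in vectors.items() if w not in exclude}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)

def analogy(a, b, c, vectors):
    """Word closest to vec(a) - vec(b) + vec(c), ignoring the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(target, vectors, exclude={a, b, c})[0][0]

# Toy 3-dimensional vectors (invented): axes are country A, country B, "capital-ness"
toy = {
    "france": [1.0, 0.0, 0.0],
    "italy": [0.0, 1.0, 0.0],
    "paris": [1.0, 0.0, 1.0],
    "rome": [0.0, 1.0, 1.0],
    "berlin": [-1.0, -1.0, 1.0],
}

print(nearest(toy["paris"], toy, exclude={"paris"})[:2])
print(analogy("paris", "france", "italy", toy))
```

Excluding the input words from the ranking matters, since the target vector usually stays closest to one of the words it was built from.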

6.10.4 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

Art History networks from Wikipedia, through VIAF id.
Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki

4.6.1 RIC SAMPLE NETWORK

5 GLAM LOD DATASETS (LODLAM)

Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)

5.1 WIKIDATA

Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.

5.1.1 WIKIDATA GENEALOGY

Family tree of Barack Obama

5.1.2 SUM OF ALL PAINTINGS

Wikidata Project Sum of All Paintings. Data used for:

Works by painter across collections (catalogue raisonné). E.g. Frans Hals

5.1.3 CROTOS

Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.

5.1.4 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
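A worklist of paintings lacking an inventory number could be produced with a query along these lines. This is a sketch, not the project's own tooling: it assumes the Wikidata Query Service conventions (predefined wd: and wdt: prefixes) and the Wikidata identifiers P31 (instance of), Q3305213 (painting), P195 (collection) and P217 (inventory number); the function name is hypothetical and the collection item must be supplied by the caller.

```python
import re

def missing_inventory_query(collection_qid, limit=100):
    """Build a SPARQL query for paintings in a collection without an inventory number."""
    # Accept only well-formed item ids so nothing unexpected is spliced into the query
    if not re.fullmatch(r"Q[1-9]\d*", collection_qid):
        raise ValueError(f"not a Wikidata item id: {collection_qid!r}")
    return f"""
SELECT ?painting WHERE {{
  ?painting wdt:P31 wd:Q3305213 ;
            wdt:P195 wd:{collection_qid} .
  FILTER NOT EXISTS {{ ?painting wdt:P217 ?inventoryNumber }}
}}
LIMIT {int(limit)}
""".strip()

# "Q123" is a placeholder: substitute the real item id of the collection
print(missing_inventory_query("Q123", limit=10))
```

Each result is then a candidate for the manual fix: look the painting up on the museum's site and add the inventory number.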

5.1.5 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info like this.

5.1.6 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

5.2 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

5.2.1 VIAF VS WIKIDATA (2015)

Page 58: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

6.1 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work) Watch the video (D.Oldman)

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL.
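A rough sketch of what such a per-year aggregation might look like. The query below is hypothetical: the choice of dc:type "newspaper" and dcterms:issued is an assumption about how the data is modeled, not the confirmed Europeana representation. The helper only relies on the standard SPARQL 1.1 JSON results layout (results, bindings, value), and the sample numbers are made up.

```python
import json

# Hypothetical query: property and value choices are assumptions for illustration.
QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?year (COUNT(*) AS ?n) WHERE {
  ?item dc:type "newspaper" ;
        dcterms:issued ?date .
  BIND (SUBSTR(STR(?date), 1, 4) AS ?year)
}
GROUP BY ?year ORDER BY ?year
"""

def counts_by_year(results_json: str) -> dict:
    """Turn a SPARQL JSON result set with ?year and ?n into {year: count}."""
    bindings = json.loads(results_json)["results"]["bindings"]
    return {int(b["year"]["value"]): int(b["n"]["value"]) for b in bindings}

# Made-up response in the SPARQL 1.1 JSON results format, standing in for an endpoint reply.
sample = json.dumps({
    "head": {"vars": ["year", "n"]},
    "results": {"bindings": [
        {"year": {"type": "literal", "value": "1914"}, "n": {"type": "literal", "value": "1200"}},
        {"year": {"type": "literal", "value": "1915"}, "n": {"type": "literal", "value": "950"}},
    ]},
})
print(counts_by_year(sample))
```

The resulting dictionary can be handed directly to any charting library; the grouping itself is done by the endpoint, which is what a record-oriented search API cannot do.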

6.6 EUROPEANA FOOD AND DRINK

Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.

EFD Semantic App

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames
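A minimal sketch of how a hierarchical facet can be counted: each object is counted for its own place and for every broader place above it. The child-to-parent table is a hypothetical stand-in for the Geonames hierarchy (assumed to be acyclic), and the object data is invented for illustration.

```python
from collections import Counter

# Hypothetical child -> parent links, standing in for the Geonames hierarchy.
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Lyon": "France",
    "France": "Europe",
}

def ancestors(place, parent=PARENT):
    """Yield the place itself, then every broader place above it."""
    while place is not None:
        yield place
        place = parent.get(place)

def facet_counts(object_places, parent=PARENT):
    """Count each object once for its place and once for every ancestor."""
    counts = Counter()
    for place in object_places.values():
        counts.update(ancestors(place, parent))
    return counts

# Invented objects, each tagged with one place.
objects = {"obj1": "Sofia", "obj2": "Plovdiv", "obj3": "Lyon", "obj4": "Bulgaria"}
counts = facet_counts(objects)
print(counts["Europe"], counts["Bulgaria"], counts["Sofia"])
```

Selecting "Bulgaria" in such a facet then returns objects tagged with the country itself as well as those tagged with any city inside it.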

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
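Jittering could be sketched roughly as follows: each flag gets a small random offset around the place's coordinates. The 50 km radius, the flat-earth approximation of 111 km per degree, and the center coordinates are all assumptions for illustration; the slides do not say how the actual jitter was computed.

```python
import math
import random

def jitter(lat, lon, max_km=50.0, rng=random):
    """Return (lat, lon) moved by a random offset of at most max_km."""
    # The square root spreads points evenly over the disc instead of clumping at the center.
    distance_km = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    dlat = distance_km * math.cos(bearing) / 111.0  # about 111 km per degree of latitude
    dlon = distance_km * math.sin(bearing) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

rng = random.Random(42)   # seeded so the layout is reproducible
center = (42.75, 25.5)    # rough center of Bulgaria, for illustration only
flags = [jitter(*center, max_km=50.0, rng=rng) for _ in range(5)]
for lat, lon in flags:
    print(f"{lat:.4f}, {lon:.4f}")
```

A fixed seed keeps flags from jumping around between page loads; in practice one might derive the seed from the object id.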

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J.Cobb, 2014).
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust. AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014

External Ontologies

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty: self-inverse)
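A minimal sketch of the idea, assuming the spreadsheet has been read into (relation, inverse) rows: asymmetric pairs become owl:inverseOf axioms in both directions, and a relation listed as its own inverse becomes an owl:SymmetricProperty. The relation names and the "ex" prefix are made up for illustration, and this is not the actual GVP generation script; prefix declarations are left out of the output.

```python
def relation_axioms(rows, prefix="ex"):
    """Turn (relation, inverse) rows into Turtle axiom lines."""
    lines = []
    seen = set()
    for rel, inv in rows:
        if (rel, inv) in seen or (inv, rel) in seen:
            continue  # this pair was already emitted from the other direction
        seen.add((rel, inv))
        if rel == inv:
            # A self-inverse relation is declared symmetric.
            lines.append(f"{prefix}:{rel} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            lines.append(f"{prefix}:{rel} a owl:ObjectProperty ; owl:inverseOf {prefix}:{inv} .")
            lines.append(f"{prefix}:{inv} a owl:ObjectProperty ; owl:inverseOf {prefix}:{rel} .")
    return lines

# Hypothetical spreadsheet rows; the relation names are placeholders.
rows = [
    ("teacherOf", "studentOf"),
    ("studentOf", "teacherOf"),   # same pair listed from the other side
    ("siblingOf", "siblingOf"),   # self-inverse
]
for line in relation_axioms(rows):
    print(line)
```

Driving this from a spreadsheet keeps the relation list editable by vocabulary editors while the ontology file stays generated and consistent.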

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). E.g.:

AAT used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
Europeana uses AAT to enrich type/subject/material fields

6.8 J.P.GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J.P.GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
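The idea can be sketched as below, using the field names from the sample record. The Person class, the expand function and the two inferences shown (the spouse's birth name built from nameSpouseMaiden, and the shared marriage date) are illustrative assumptions, not the EHRI implementation; inferred values are kept separate from stated ones so a later matching step can weigh them differently.

```python
from dataclasses import dataclass, field

@dataclass
class Person:
    name: str
    relation: str                                  # relation to the record's main person
    inferred: dict = field(default_factory=dict)   # likely guesses, kept apart from stated facts

def expand(record: dict) -> list:
    """Create one Person per individual mentioned in a record."""
    people = [Person(f'{record["firstName"]} {record["lastName"]}', "self")]
    for key, relation in [("nameSpouse", "spouse"), ("nameChild", "child"), ("nameSibling", "sibling")]:
        if key in record:
            people.append(Person(record[key], relation))
    for person in people:
        if person.relation == "spouse":
            if "nameSpouseMaiden" in record:
                first = person.name.split()[0]
                person.inferred["birthName"] = f'{first} {record["nameSpouseMaiden"]}'
            if "dateMarriage" in record:
                person.inferred["dateMarriage"] = record["dateMarriage"]
    return people

record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
for person in expand(record):
    print(person)
```

Each derived Person can then be compared against existing Person records, for example a "Maria Matienzo" appearing in another archive's list.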

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
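For readers new to word2vec, a small sketch of the two operations behind the table and the analogy: cosine similarity between word vectors (the "Cos dist" scores appear to be similarities, since the closest neighbors have the highest values) and vector arithmetic followed by a nearest-neighbor lookup. The three-dimensional vectors are made up purely for illustration; real word2vec vectors have hundreds of dimensions and are learned from the interview text.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy(vectors, a, b, c):
    """Word closest to vec(a) - vec(b) + vec(c), excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Made-up toy vectors; the axes loosely mean "security organ", "Soviet", "Nazi".
toy = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS":     [1.0, 0.0, 1.0],
    "guard":  [1.0, 0.1, 0.1],
}
print(analogy(toy, "KGB", "Stalin", "Hitler"))
```

With these toy vectors the subtraction removes the "Soviet" component and the addition brings in the "Nazi" one, so the nearest remaining word is the one combining "security organ" and "Nazi".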

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates

6.11 OTHERS' PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals. 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6.13.7 KERAMEIKOS: POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki

Page 59: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO Place enrichment Person name recognitionINRIA word2vec experiments

guard Cos dist punishment Cos dist

guard Cos dist punishment Cos dist

guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123

semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper

Art History networks from Wikipedia through VIAF idTime and nationality from ULAN

612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT

613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD

Ethan Gruber

Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War

Art ofDevastation

6134 NOMISMA

Shared authorities for numismatics Eg a mint

6135 COINHOARDS

Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)

CoinHoardsorgnomismaorg

6136 STATISTICAL CHARTS

Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js

6137 KERAMEIKOS POTTERY LOD

editor Based on XForms leverages Getty and BM LODKerameikos Project

6138 EADITOR AND XEAC

Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki

Page 60: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

511 WIKIDATA GENEALOGY

Family tree of Barack Obama

512 SUM OF ALL PAINTINGS

Data used forWikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonneacute) Eg Frans Hals

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: "Drawings by Rembrandt that are about Mammals"

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work) (D. Oldman) Watch the video

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed the mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. a chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.
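A rough sketch of what such a statistics query could look like, wrapped in Python so the grouping logic can be tried offline. The SPARQL text is an assumption for illustration: the properties (`dc:type`, `dcterms:issued`) and the literal "newspaper" are guesses, not taken from the Europeana endpoint. The helper `count_by_year` is hypothetical and only mimics the GROUP BY on plain date strings.

```python
from collections import Counter

# Hypothetical SPARQL text. Property names and the "newspaper" literal are
# assumptions for illustration, not the actual Europeana data model.
NEWSPAPERS_BY_YEAR = """
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?year (COUNT(?item) AS ?n) WHERE {
  ?item dc:type "newspaper" ;
        dcterms:issued ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
}
GROUP BY ?year
ORDER BY ?year
"""


def count_by_year(issued_dates):
    """Mimic the GROUP BY above on plain ISO date strings (YYYY-MM-DD)."""
    counts = Counter(date[:4] for date in issued_dates)
    return sorted(counts.items())


sample = ["1914-07-29", "1914-08-01", "1915-01-02"]
print(count_by_year(sample))  # [('1914', 2), ('1915', 1)]
```

The point is that an aggregate per year is one query on the SPARQL side, whereas an item-oriented API would need paging through every record.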

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames
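A hierarchical facet boils down to counting each object at its own place and at every ancestor of that place. The sketch below illustrates the idea only. The `PARENT` table is a hypothetical stand-in for the Geonames hierarchy, and `facet_counts` is an invented helper, not the EFD implementation.

```python
from collections import Counter

# Hypothetical child -> parent links, standing in for the Geonames hierarchy.
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Krakow": "Poland",
    "Poland": "Europe",
}


def facet_counts(object_places):
    """Count each object at its place and at every ancestor of that place."""
    counts = Counter()
    for place in object_places:
        node = place
        while node is not None:
            counts[node] += 1
            node = PARENT.get(node)  # None once we pass the top of the tree
    return counts


counts = facet_counts(["Sofia", "Sofia", "Plovdiv", "Krakow"])
print(counts["Bulgaria"], counts["Europe"])  # 3 4
```

In a triple store the same roll-up would usually come from a transitive parent property rather than a loop, but the resulting facet numbers are the same.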

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
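One way to jitter markers is to displace each one by a small random offset around the shared coordinate. This is a minimal sketch under stated assumptions: the function name `jitter`, the 50 km radius, and the centre coordinates are illustrative choices, not values from the EFD application.

```python
import math
import random


def jitter(lat, lon, max_km=50.0, rng=random):
    """Return (lat, lon) displaced by a random offset of at most max_km."""
    # sqrt spreads points evenly over the disc instead of crowding the centre
    distance = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # Roughly 111 km per degree of latitude; longitude degrees shrink with latitude
    dlat = (distance * math.cos(bearing)) / 111.0
    dlon = (distance * math.sin(bearing)) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


rng = random.Random(42)   # fixed seed so the markers stay put between page loads
center = (42.7, 25.5)     # rough centre of Bulgaria, for illustration only
markers = [jitter(*center, rng=rng) for _ in range(5)]
print(len(set(markers)))
```

Seeding the generator (or deriving the offset from the object id) keeps each flag in the same spot across renders.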

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of the LODLAM cloud (diagram by J. Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation. V. Alexiev, CIDOC 2014

External Ontologies

Prefix  Ontology                                Used for
bibo    Bibliography Ontology                   Sources
dc      Dublin Core Elements                    common
dct     Dublin Core Terms                       common
foaf    Friend of a Friend ontology             Contributors
iso     ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                   Basic RDF representation
prov    Provenance Ontology                     Revision history
rdf     Resource Description Framework          Basic RDF representation
rdfs    RDF Schema                              Basic RDF representation
schema  Schema.org                              common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System    Basic vocabulary representation
skosxl  SKOS Extension for Labels               Rich labels
wgs     W3C World Geodetic Survey geo           Geo (TGN)
xsd     XML Schema Datatypes                    Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
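To make the inverse-pair idea concrete, here is a small sketch that turns a relation table into ontology axioms and materializes the inverse triples. Everything in it is hypothetical: the `ex:` property names, the `RELATIONS` rows (standing in for spreadsheet rows), and both helper functions are invented for illustration and are not the GVP generation scripts.

```python
# Hypothetical relation table, standing in for rows of the spreadsheet.
# Each row is (property, inverse); a symmetric property is its own inverse.
RELATIONS = [
    ("ex:teacherOf", "ex:studentOf"),
    ("ex:siblingOf", "ex:siblingOf"),
]


def ontology_axioms(relations):
    """Emit Turtle lines declaring inverse pairs and symmetric properties."""
    lines = []
    for prop, inverse in relations:
        if prop == inverse:
            lines.append(f"{prop} a owl:SymmetricProperty .")
        else:
            lines.append(f"{prop} owl:inverseOf {inverse} .")
            lines.append(f"{inverse} owl:inverseOf {prop} .")
    return lines


def add_inverses(triples, relations):
    """Return the triples plus the inverse of every asserted relation."""
    inverse_of = {}
    for prop, inverse in relations:
        inverse_of[prop] = inverse
        inverse_of[inverse] = prop
    result = set(triples)
    for s, p, o in triples:
        if p in inverse_of:
            result.add((o, inverse_of[p], s))
    return result


print("\n".join(ontology_axioms(RELATIONS)))
```

In a reasoning repository the inverse triples would normally be inferred from the axioms; `add_inverses` only shows what that inference produces.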

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

6.8 J. P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples

6.8.1 J. P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after"

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
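The record-expansion step could be sketched roughly as follows. This is an illustration, not the EHRI code: the field names come from the sample record above, while `RELATED_FIELDS`, `derive_persons`, the output keys, and the specific inferences (spouse gender, shared marriage date) are assumptions that a real pipeline would need to validate.

```python
RELATED_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def derive_persons(record):
    """Create candidate Person records for relatives named in one record."""
    persons = []
    for field, relation in RELATED_FIELDS.items():
        full_name = record.get(field)
        if not full_name:
            continue
        # Naive split: everything before the last space is the first name
        first, _, last = full_name.rpartition(" ")
        person = {
            "firstName": first,
            "lastName": last,
            "relation": relation,
            "relatedTo": record["id"],
        }
        if relation == "spouse":
            # Likely inferences only; each one is a guess to be checked later
            if record.get("gender") in ("Male", "Female"):
                person["gender"] = "Female" if record["gender"] == "Male" else "Male"
            person["dateMarriage"] = record.get("dateMarriage")
            person["maidenName"] = record.get("nameSpouseMaiden")
        persons.append(person)
    return persons


rec = {
    "id": 123456, "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
for person in derive_persons(rec):
    print(person)
```

The derived records can then be compared with existing Person records, with each inferred field treated as weaker evidence than a stated one.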

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard     Cos dist   punishment   Cos dist
guarding  0.593507   punishments  0.668144
sentry    0.512083   punish       0.601212
hlinka    0.496201   punishing    0.543213
gate      0.490032   beatings     0.527033
watching  0.484647   penalty      0.497262
rifle     0.484379   deserved     0.490157
lookout   0.482025   beaten       0.473870
patrol    0.477233   straf        0.473338
soldier   0.475982   offense      0.461230
guarded   0.474689   executing    0.459965
police    0.474291   merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
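Both the nearest-neighbour table and the "semantic differencing" come down to cosine similarity between word vectors. The sketch below shows the arithmetic on invented data: the 2-dimensional `TOY` vectors and the classic king/queen example are assumptions for illustration, not the INRIA model or its results.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word nearest to a - b + c, excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy 2-dimensional vectors invented for illustration; real word2vec
# vectors have hundreds of dimensions and come from a trained model.
TOY = {
    "king": [0.9, 0.8],
    "man": [0.1, 0.9],
    "woman": [0.1, 0.1],
    "queen": [0.9, 0.1],
    "apple": [-0.5, 0.4],
}
print(analogy(TOY, "king", "man", "woman"))  # queen
```

Higher values mean closer words, which matches the ordering of the table above even though its columns are labelled "Cos dist".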

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki

Page 62: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

513 CROTOS

Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos

514 YOU CAN HELP TOO

(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers

US Getty Museum

515 LETS FIX THE SECOND ONE

like thisFind it on Gettys site add the info

516 HISTROPEDIA

Timelines of everyting Eg paintings by Leonardo

52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev, A., Tolosi, L., and Alexiev, V. LNCS 9398, p. 182-196, January 2016 (preprint).

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev, V. DBpedia meeting, February 2016.

665 EFD ENRICHMENT FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
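Jittering can be sketched in a few lines. This is an illustration only, not the project's code: the function name, the uniform distribution and the half-degree default are assumptions, and the coordinates in the usage note are only an approximate center of Bulgaria.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=None):
    """Return (lat, lon) moved by a small random offset in each direction.

    max_offset_deg is an assumed tuning knob: larger values spread markers
    further from the shared point.
    """
    rng = rng or random.Random()
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )
```

For example, `[jitter(42.7, 25.5) for _ in range(9000)]` spreads 9k markers over a square around the shared point instead of stacking them on one pixel. A real implementation might clip the result to the country outline.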

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification.
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn).
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.).
GraphDB semantic repo, clustered for high availability.
Semantic application development (customized Forest user interface) and tech consulting.
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD.
Help desk support on twitter and google group (see home page).
Presentations, papers.

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev, V., Lindenthal, J., and Isaac, A. International Journal on Digital Libraries, August 2015, Springer.

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation (V. Alexiev, CIDOC 2014).

External Ontologies:

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
a custom sub-class
a custom (sub-)prop
an ontology value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
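What inverse and symmetric declarations buy you can be shown with a small sketch that materializes the implied triples. The property names (ex:teacherOf and so on) and the person identifiers are hypothetical, and in the real system this inference is done by the triple store's rules, not by application code.

```python
# Hypothetical relation table: each property maps to its owl:inverseOf partner;
# a symmetric property (owl:SymmetricProperty) maps to itself.
INVERSE_OF = {
    "ex:teacherOf": "ex:studentOf",
    "ex:studentOf": "ex:teacherOf",
    "ex:siblingOf": "ex:siblingOf",
}


def add_inverses(triples):
    """Return the input triples plus every triple implied by INVERSE_OF."""
    result = set(triples)
    for subject, prop, obj in triples:
        inverse = INVERSE_OF.get(prop)
        if inverse is not None:
            result.add((obj, inverse, subject))
    return result
```

With this, stating the relation in one direction is enough: a query in the other direction still finds it.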

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev, V., Cobb, J., Garcia, G., Harpring, P. Getty Research Institute, 3.2 edition, March 2015.

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

68 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
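The idea of spinning off Person records from one source record can be sketched as follows. This is a hypothetical illustration, not EHRI code: the output field names (role, maidenName, genderInferred) are invented here, and the spouse-gender rule is just one example of a "likely inference" that would need review against real data.

```python
def derive_persons(rec):
    """Create stub Person records for everyone mentioned in one source record."""
    persons = [{
        "role": "self",
        "name": f'{rec["firstName"]} {rec["lastName"]}',
        "gender": rec.get("gender"),
        "source": rec["id"],
    }]
    if "nameSpouse" in rec:
        spouse = {"role": "spouse", "name": rec["nameSpouse"], "source": rec["id"]}
        if "nameSpouseMaiden" in rec:
            spouse["maidenName"] = rec["nameSpouseMaiden"]
        # Likely inference (an assumption, flagged as such): the spouse has the
        # opposite gender of the main person.
        opposite = {"Male": "Female", "Female": "Male"}
        if rec.get("gender") in opposite:
            spouse["gender"] = opposite[rec["gender"]]
            spouse["genderInferred"] = True
        persons.append(spouse)
    for field, role in (("nameChild", "child"), ("nameSibling", "sibling")):
        if field in rec:
            persons.append({"role": role, "name": rec[field], "source": rec["id"]})
    return persons
```

Each stub keeps the source record id, so a later matching step can link the derived persons back to each other and build the network.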

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist   punishment    Cos dist
guarding   0.593507   punishments   0.668144
sentry     0.512083   punish        0.601212
hlinka     0.496201   punishing     0.543213
gate       0.490032   beatings      0.527033
watching   0.484647   penalty       0.497262
rifle      0.484379   deserved      0.490157
lookout    0.482025   beaten        0.473870
patrol     0.477233   straf         0.473338
soldier    0.475982   offense       0.461230
guarded    0.474689   executing     0.459965
police     0.474291   merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
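The vector arithmetic behind such "semantic differencing" can be sketched with plain Python. The two-dimensional vectors below are hand-made toys chosen so the arithmetic works out; they are not taken from the trained model, and real word2vec vectors have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Toy vectors (assumed): axis 0 = "security organization", axis 1 = regime.
TOY = {
    "KGB": [1.0, 1.0],
    "Stalin": [0.0, 1.0],
    "Hitler": [0.0, -1.0],
    "SS": [1.0, -1.0],
    "guard": [0.5, 0.0],
}
```

Here `analogy(TOY, "KGB", "Stalin", "Hitler")` picks "SS", because subtracting one regime and adding the other flips the second axis while keeping the first.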

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id.
Time and nationality from ULAN.

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

514 YOU CAN HELP TOO

Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).

515 LET'S FIX THE SECOND ONE

Find it on Getty's site, add the info, like this.

516 HISTROPEDIA

Timelines of everything. E.g. paintings by Leonardo.

52 VIAF

Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions are sometimes integrated into Virtual Research Environments.

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search


674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO Place enrichment Person name recognitionINRIA word2vec experiments

guard Cos dist punishment Cos dist

guard Cos dist punishment Cos dist

guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123

semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper

Art History networks from Wikipedia through VIAF idTime and nationality from ULAN

612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT

613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD

Ethan Gruber

Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War

Art ofDevastation

6134 NOMISMA

Shared authorities for numismatics Eg a mint

6135 COINHOARDS

Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)

CoinHoardsorgnomismaorg

6136 STATISTICAL CHARTS

Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js

6137 KERAMEIKOS POTTERY LOD

editor Based on XForms leverages Getty and BM LODKerameikos Project

6138 EADITOR AND XEAC

Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki

Page 65: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

516 HISTROPEDIA

Timelines of everything, e.g. paintings by Leonardo.

52 VIAF

Virtual International Authority File: 20 national libraries and 10 other contributors, including Getty ULAN and Wikidata. E.g. the coreferencing cluster of Spinoza.

521 VIAF VS WIKIDATA (2015)

53 GLOBAL AUTHORITY CONTROL

2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets that include Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata).
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.

531 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. It covers 234 authorities, including Getty AAT, TGN, ULAN; RKD artists and works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5331 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT: single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters; prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAP:DRGN project)

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchiveSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists

62 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).

Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. "Drawings by Rembrandt that are about Mammals".

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D.Oldman).

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

64 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

65 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.

651 EUROPEANA STATISTICS

E.g. a chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.
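A per-year count like this is a plain SPARQL aggregate. The sketch below only builds the query text. It assumes that items carry dc:type and dc:date literals, which may not match the actual Europeana (EDM) modelling; the function name and the type value are illustrative.

```python
# Assumption: newspapers are typed with dc:type and dated with dc:date.
# The real Europeana data may use different properties.
QUERY_TEMPLATE = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?year (COUNT(?item) AS ?count)
WHERE {
  ?item dc:type "TYPE_VALUE" ;
        dc:date ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
}
GROUP BY ?year
ORDER BY ?year
"""


def items_by_year_query(type_value):
    """Return a SPARQL query that counts items of one type per year."""
    return QUERY_TEMPLATE.replace("TYPE_VALUE", type_value)


query = items_by_year_query("newspaper")
print(query)
```

The result rows (year, count) can be fed directly to a charting library.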

66 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.

661 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016

665 EFD ENRICHMENT FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, and NLP capabilities.

666 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames
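A hierarchical facet counts each object under its place and under every ancestor of that place. A minimal sketch, assuming a toy parent table in place of the real Geonames hierarchy (place names and structure here are illustrative only):

```python
from collections import Counter

# Toy parent links standing in for the Geonames hierarchy (illustrative data).
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Krakow": "Poland",
    "Poland": "Europe",
}


def ancestors(place):
    """Return the place followed by all of its ancestors, nearest first."""
    path = [place]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def facet_counts(object_places):
    """Count every object under its own place and under each ancestor."""
    counts = Counter()
    for place in object_places:
        counts.update(ancestors(place))
    return counts


counts = facet_counts(["Sofia", "Sofia", "Plovdiv", "Krakow"])
```

Selecting "Bulgaria" in the facet then also finds the objects tagged only with Sofia or Plovdiv.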

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
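Jittering can be sketched as adding a small random offset to each marker. This is not the EFD implementation: the radius, the seed and the approximate centre coordinates are assumptions chosen for illustration.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the point moved by a random offset of at most radius_deg per axis."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))


BULGARIA_CENTER = (42.7, 25.5)  # approximate centre, for illustration only
rng = random.Random(42)         # seeded so the layout is reproducible
markers = [jitter(*BULGARIA_CENTER, rng=rng) for _ in range(5)]
```

A fixed seed keeps the flags in the same spots between page loads.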

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

Easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust.
AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to the Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, the ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

Alexiev V., Lindenthal J. and Isaac A. On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). International Journal on Digital Libraries, August 2015, Springer

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014

External Ontologies:

Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
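What inverse and symmetric declarations buy you can be sketched in a few lines: every stated relation implies a second statement in the opposite direction. The relation names below are hypothetical and not taken from the GVP ontology; a real repository would derive this through its inference rules.

```python
def add_inverses(triples, inverse_of, symmetric):
    """Return the triples plus statements implied by inverse and symmetric props."""
    # Make the inverse table usable in both directions.
    both_ways = dict(inverse_of)
    both_ways.update({v: k for k, v in inverse_of.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in both_ways:
            result.add((o, both_ways[p], s))   # owl:inverseOf
        if p in symmetric:
            result.add((o, p, s))              # owl:SymmetricProperty
    return result


# Hypothetical relation names, for illustration only.
INVERSE_OF = {"ex:producedBy": "ex:produces"}
SYMMETRIC = {"ex:relatedTo"}
triples = {
    ("ex:ink", "ex:producedBy", "ex:inkMaker"),
    ("ex:a", "ex:relatedTo", "ex:b"),
}
closed = add_inverses(triples, INVERSE_OF, SYMMETRIC)
```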

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). E.g.:

AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6712 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.

68 JPGETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
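The idea can be sketched on the sample record: one Person per individual mentioned, plus one "likely inference" (the spouse's maiden-name form). The class, the field-to-relation table and the inference rule are assumptions made for illustration, not the EHRI implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    relation: str          # relation to the main person of the record
    alt_names: tuple = ()  # other name forms inferred for the same person


# Record fields that mention other people, and the relation each one implies.
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


def extract_persons(record):
    """Create one Person per individual mentioned in a source record."""
    persons = [Person(f"{record['firstName']} {record['lastName']}", "self")]
    for field, relation in RELATION_FIELDS.items():
        if field not in record:
            continue
        name, alt_names = record[field], ()
        if relation == "spouse" and "nameSpouseMaiden" in record:
            # Likely inference: same given name(s), maiden surname.
            given = name.rsplit(" ", 1)[0]
            alt_names = (f"{given} {record['nameSpouseMaiden']}",)
        persons.append(Person(name, relation, alt_names))
    return persons


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
persons = extract_persons(record)
```

The extra name form "Maria Matienzo" gives the matcher a second chance to find the spouse in records created before the marriage.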

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
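How matching to a gazetteer also yields deduplication can be shown with simple fuzzy string matching: name variants that resolve to the same id fall into one group. This sketch uses Python's difflib and a three-entry stand-in gazetteer with made-up ids; the actual EHRI pipeline is not described in the slides and is surely more involved.

```python
import difflib

# Tiny stand-in gazetteer: normalised name -> id (ids are made up).
GAZETTEER = {"krakow": "G1", "lviv": "G2", "vilnius": "G3"}


def normalise(name):
    """Trim and lowercase a place name before comparison."""
    return name.strip().lower()


def match_place(name, cutoff=0.8):
    """Return the id of the closest gazetteer entry, or None if nothing is close."""
    hits = difflib.get_close_matches(normalise(name), list(GAZETTEER), n=1, cutoff=cutoff)
    return GAZETTEER[hits[0]] if hits else None


def deduplicate(names):
    """Group source names by the gazetteer id they match."""
    groups = {}
    for name in names:
        groups.setdefault(match_place(name), []).append(name)
    return groups


groups = deduplicate(["Krakow", "Krakov", "Lviv"])
```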

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
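Semantic differencing is vector arithmetic followed by a nearest-neighbour search under the cosine measure. The sketch below uses hand-made 3-D toy vectors with placeholder words (org_a, leader_a and so on stand in for the words on the slide). Real word2vec vectors have hundreds of dimensions and are learned from the interview text.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(vectors, a, b, c):
    """Return the word closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy axes: (is an organisation, belongs to regime A, belongs to regime B).
VECTORS = {
    "org_a": [1.0, 1.0, 0.0],
    "leader_a": [0.0, 1.0, 0.0],
    "leader_b": [0.0, 0.0, 1.0],
    "org_b": [1.0, 0.0, 1.0],
    "org_other": [1.0, 0.0, 0.0],
}
answer = analogy(VECTORS, "org_a", "leader_a", "leader_b")
```

Subtracting leader_a removes the "regime A" component, and adding leader_b puts "regime B" in its place, so the nearest remaining word is org_b.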

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id.
Time and nationality from ULAN.

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki


Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO Place enrichment Person name recognitionINRIA word2vec experiments

guard Cos dist punishment Cos dist

guard Cos dist punishment Cos dist

guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123

semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper

Art History networks from Wikipedia through VIAF idTime and nationality from ULAN

612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT

613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD

Ethan Gruber

Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War

Art ofDevastation

6134 NOMISMA

Shared authorities for numismatics Eg a mint

6135 COINHOARDS

Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)

CoinHoardsorgnomismaorg

6136 STATISTICAL CHARTS

Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js

6137 KERAMEIKOS POTTERY LOD

editor Based on XForms leverages Getty and BM LODKerameikos Project

6138 EADITOR AND XEAC

Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki

Page 67: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

5.2.1 VIAF VS WIKIDATA (2015)

5.3 GLOBAL AUTHORITY CONTROL
- 2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
- 2015-01: Wikidata Project Authority Control (initiated by Ontotext)
- 2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:
  - The best datasets to use for name enrichment are VIAF and Wikidata.
  - There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
  - VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
  - Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
  - A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
  - Wikidata has great tools for crowd-sourced coreferencing.
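The point about name forms shared between library-tradition and LOD-tradition datasets can be made concrete with a small overlap measure. This is a minimal sketch, not the method used in the Europeana study: the normalization rules, the function names and the sample name forms are all assumptions chosen for illustration.

```python
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and sort the tokens,
    so that 'Cranach, Lucas' and 'Lucas Cranach' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(sorted(cleaned.lower().split()))


def overlap(names_a, names_b):
    """Return the shared normalized forms and their Jaccard score."""
    a = {normalize(n) for n in names_a}
    b = {normalize(n) for n in names_b}
    shared = a & b
    union = a | b
    return shared, (len(shared) / len(union) if union else 0.0)


# Illustrative name forms only; not copied from VIAF or Wikidata records.
LIBRARY_FORMS = ["Cranach, Lucas", "Lucas Cranach d.Ä.", "Kranach, Lukas"]
LOD_FORMS = ["Lucas Cranach", "Lucas Cranach l'Ancien", "Lucas Cranach der Ältere"]

if __name__ == "__main__":
    shared, score = overlap(LIBRARY_FORMS, LOD_FORMS)
    print(sorted(shared), round(score, 2))
```

A low score on real data would support the conclusion that the two traditions complement each other rather than duplicate each other.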

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT: single sign-on, a click per item. Easy!

6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:

- Publishing vocabularies/thesauri as LOD
- Publishing museum collections and national bibliographies as LOD
- Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
- Study of artistic influence over time and space
- Literary traditions, parallel editions
- Poetic repertories
- Studying manuscripts, stemmatology (manuscript derivation)
- Historiography
- Studying charters, prosopography (micro biographies). See "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015).

Research functions are sometimes integrated into Virtual Research Environments.

6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

- CollectionSpace: museum collection management
- ArchivesSpace: archive management
- ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
- ConservationSpace: line-of-business application for conservation specialists

6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search, e.g. drawings by Rembrandt that are about mammals.

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach.

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman).

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD
- GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
- As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM.
- This mapping was followed by the Yale Center for British Art (YCBA).
- Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL.

6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Tagarev A., Tolosi L. and Alexiev V. Domain-specific modeling: Towards a Food and Drink Gazetteer. LNCS 9398, p. 182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Alexiev V. Using DBpedia in Europeana Food and Drink. DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.
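One way to picture a hierarchical place facet is as a roll-up of object counts along the parent chain of each place. The sketch below is only an illustration: the parent map is hand-written here, whereas a real application would take the hierarchy from Geonames, and the function name `facet_counts` is an assumption.

```python
from collections import Counter


def facet_counts(object_places, parent):
    """Count objects per place, adding each object to all ancestors too."""
    counts = Counter()
    for place in object_places:
        node = place
        seen = set()  # guards against accidental cycles in the parent map
        while node is not None and node not in seen:
            counts[node] += 1
            seen.add(node)
            node = parent.get(node)
    return counts


# Hand-written stand-in for a place hierarchy (child -> parent).
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Paris": "France",
    "France": "Europe",
}

if __name__ == "__main__":
    places_of_objects = ["Sofia", "Sofia", "Plovdiv", "Paris"]
    for place, n in facet_counts(places_of_objects, PARENT).most_common():
        print(place, n)
```

Selecting "Bulgaria" in such a facet then returns objects tagged with Sofia or Plovdiv as well, which is the reason the place hierarchy was needed.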

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them.
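Jittering can be sketched as adding a small random offset to each marker's coordinates. This is an illustrative sketch only: the uniform offset, the 0.5 degree default and the approximate center used in the example are assumptions, not the values used in the EFD application.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=None):
    """Return the point moved by a random offset of at most
    max_offset_deg degrees in latitude and in longitude."""
    rng = rng or random
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


if __name__ == "__main__":
    # Approximate center of Bulgaria, used here only as an example.
    center = (42.7, 25.5)
    rng = random.Random(0)  # seeded so the layout is reproducible
    for _ in range(3):
        print(jitter(*center, rng=rng))
```

Seeding the generator keeps markers in the same place between page loads, which avoids flags jumping around when the map is redrawn.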

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
- Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
- Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy.
- Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD
- GVP: well-known and respected in GLAM
- Dependencies: AAT-TGN-ULAN-CONA
- Center of LODLAM cloud (diagram by J. Cobb, 2014)
- GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

6.7.2 ONTOTEXT SCOPE OF WORK

- Semantic/ontology development: http://vocab.getty.edu/ontology
- Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
- Complete mapping specification
- Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
- Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
- GraphDB semantic repo, clustered for high availability
- Semantic application development (customized Forest user interface) and tech consulting
- SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
- Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
- Sample queries (100), including charts, geographic queries, etc.
- Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
- Help desk / support on twitter and google group (see home page)
- Presentations, papers: Alexiev V., Lindenthal J. and Isaac A. On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). International Journal on Digital Libraries, August 2015, Springer.

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation (V. Alexiev, CIDOC 2014).

External Ontologies:

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
- Custom sub-class
- Custom (sub-)prop
- Ontology value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
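The practical effect of declaring relations as inverse pairs or as symmetric is that a statement in one direction implies a statement in the other. The sketch below materializes those implied statements over plain tuples. It is not how GraphDB performs inference, and the `ex:` property names are hypothetical, not taken from the GVP ontology.

```python
def add_inverses(triples, inverse_of, symmetric):
    """Return the input triples plus those implied by inverse and
    symmetric property declarations."""
    # Make the inverse lookup work in both directions.
    pairs = dict(inverse_of)
    pairs.update({q: p for p, q in inverse_of.items()})

    result = set(triples)
    for s, p, o in triples:
        if p in pairs:
            result.add((o, pairs[p], s))
        if p in symmetric:
            result.add((o, p, s))
    return result


# Hypothetical declarations and data, for illustration only.
INVERSE_OF = {"ex:producedBy": "ex:produces"}
SYMMETRIC = {"ex:relatedTo"}
TRIPLES = {
    ("ex:vase", "ex:producedBy", "ex:potter"),
    ("ex:vase", "ex:relatedTo", "ex:amphora"),
}

if __name__ == "__main__":
    for triple in sorted(add_inverses(TRIPLES, INVERSE_OF, SYMMETRIC)):
        print(triple)
```

Declaring each pair once in a spreadsheet and generating both directions mechanically avoids the two halves drifting apart.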

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Alexiev V., Cobb J., Garcia G., Harpring P. Getty Vocabularies: Linked Open Data Semantic Representation. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD, Why Now (J. Cobb, 2014). E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

- Europeana uses AAT to enrich type/subject/material fields.
- PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

- In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
- In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced the chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match them to other Person records in the database.
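The idea of creating Person records from the additional names can be sketched as follows. This is a minimal illustration, not the EHRI implementation: the output field names (`relation`, `relatedTo`, `maidenName`) and the rule of splitting a full name at the last space are assumptions.

```python
def derive_persons(rec):
    """Return stub Person records for the relatives named in one record."""
    relations = {
        "nameSpouse": "spouse",
        "nameChild": "child",
        "nameSibling": "sibling",
    }
    persons = []
    for field, relation in relations.items():
        full_name = rec.get(field)
        if not full_name:
            continue
        # Assumption: the last token is the family name.
        first, _, last = full_name.rpartition(" ")
        person = {
            "firstName": first,
            "lastName": last,
            "relation": relation,
            "relatedTo": rec["id"],
        }
        # A likely inference: the spouse's maiden name is the one recorded.
        if relation == "spouse" and rec.get("nameSpouseMaiden"):
            person["maidenName"] = rec["nameSpouseMaiden"]
        persons.append(person)
    return persons


# The sample record from the slide, as a dictionary.
RECORD = {
    "id": 123456,
    "firstName": "John",
    "lastName": "Smith",
    "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}

if __name__ == "__main__":
    for person in derive_persons(RECORD):
        print(person)
```

Each stub keeps a link back to the source record, so a later match against another Person record also connects the two original records into a network.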

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH interviews.

- ONTO: place enrichment, person name recognition
- INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
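The word lists and the "semantic differencing" example both rest on cosine comparison of word vectors, where a higher score means a closer word. The sketch below shows the arithmetic on hand-made 3-dimensional toy vectors. These vectors are invented so that the example works; they are not the INRIA embeddings, which were trained on the interview corpus.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c),
    excluding the three input words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors, invented for illustration only.
TOY = {
    "kgb": (1.0, 0.0, 1.0),
    "stalin": (1.0, 0.0, 0.0),
    "hitler": (0.0, 1.0, 0.0),
    "ss": (0.0, 1.0, 1.0),
    "guard": (0.5, 0.5, 0.2),
}

if __name__ == "__main__":
    print(analogy(TOY, "kgb", "stalin", "hitler"))
```

With real embeddings the same two functions produce ranked neighbor lists like the ones in the table, by sorting all vocabulary words by their cosine score against the query word.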

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper).

Art history networks from Wikipedia, through VIAF id. Time and nationality from ULAN.

6.12 CHARTEX
NLP analysis of medieval charters and deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

- Numishare: data platform for coins/medals, 100k coin types
- Nomisma: shared authorities for numismatics
- Kerameikos: pottery LOD
- EADitor: EAD editor based on XML & XForms, uses/produces LOD
- xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

Page 68: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions

The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing

Authority Addicts The New Frontier of Authority Control on Wikidata

Wikidata Project Authority ControlName Data Sources for Semantic Enrichment

531 NAMES OF LUCAS CRANACH

in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach

532 WIKIDATA COREFERENCING CAN ENLARGE VIAF

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard       Cos dist    punishment    Cos dist
guarding    0.593507    punishments   0.668144
sentry      0.512083    punish        0.601212
hlinka      0.496201    punishing     0.543213
gate        0.490032    beatings      0.527033
watching    0.484647    penalty       0.497262
rifle       0.484379    deserved      0.490157
lookout     0.482025    beaten        0.473870
patrol      0.477233    straf         0.473338
soldier     0.475982    offense       0.461230
guarded     0.474689    executing     0.459965
police      0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
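The vector arithmetic behind such a result can be sketched in a few lines. The three-dimensional vectors below are invented purely for illustration (real word2vec models have hundreds of learned dimensions), and the `cosine` and `analogy` helpers are assumptions for this example, not the INRIA code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy axes (assumed): [security organization, regime (+1 Soviet / -1 Nazi), leader]
toy = {
    "KGB":    [1.0,  1.0, 0.0],
    "Stalin": [0.0,  1.0, 1.0],
    "Hitler": [0.0, -1.0, 1.0],
    "SS":     [1.0, -1.0, 0.0],
    "guard":  [0.8,  0.0, 0.1],
}
result = analogy(toy, "KGB", "Stalin", "Hitler")
print(result)
```

Subtracting "Stalin" removes the Soviet and leader components, adding "Hitler" puts back the leader component with the opposite regime sign, and the nearest remaining vector by cosine is the answer.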

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

Page 69: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

5.3.1 NAMES OF LUCAS CRANACH

Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)

5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF

5.3.3 MIX-N-MATCH

A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.

5.3.3.1 YOU CAN HELP WITH AUTHORITIES, TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!

6 LODLAM PROJECTS

GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAPDRGN project)

Research functions and sometimes integrated into Virtual Research Environments

6.1 MELLON SPACE PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, weaved using Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman)

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE

Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but is easy with SPARQL

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint; open source (Github). Uses GraphDB and ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
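To give a feel for what marker clustering does, here is a minimal grid-based sketch. It is not the algorithm of the Cluster Mapper library (which the slides do not describe); the `grid_cluster` function, the cell size and the sample coordinates are all assumptions for illustration.

```python
import math
from collections import defaultdict


def grid_cluster(points, cell_deg=1.0):
    """Group (lat, lon) points into square grid cells of cell_deg degrees.

    Returns one cluster per occupied cell, with its centroid and member count.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append({
            "lat": sum(p[0] for p in members) / n,
            "lon": sum(p[1] for p in members) / n,
            "count": n,
        })
    return clusters


# Two points near Sofia fall in the same 1-degree cell; Paris is on its own.
sample_points = [(42.7, 23.3), (42.1, 23.9), (48.85, 2.35)]
clusters = grid_cluster(sample_points)
for cluster in clusters:
    print(cluster)
```

A map UI would draw one marker per cluster, labeled with `count`, and shrink `cell_deg` as the user zooms in.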

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
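Jittering can be sketched as adding a small random offset to each marker's coordinates. This is a minimal sketch under stated assumptions: the `jitter` function, the maximum offset and the approximate center coordinates are illustrative, not the values used in the EFD application.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return (lat, lon) shifted by a uniform random offset in each direction."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


# Approximate center of Bulgaria (assumed value, for illustration only).
BULGARIA_CENTER = (42.7, 25.5)

rng = random.Random(42)  # seeded so that marker positions are reproducible
markers = [jitter(*BULGARIA_CENTER, rng=rng) for _ in range(5)]
for marker in markers:
    print(marker)
```

Seeding the generator keeps each object's flag in the same place between page loads. A uniform offset is the simplest choice, and the offset bound should stay small enough that flags remain inside the country.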

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust. AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014

External Ontologies

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
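What inverse and symmetric declarations buy you can be sketched as a tiny materialization step: for every asserted triple, add the triple in the opposite direction. The `materialize_inverses` function and the `ex:` property names are hypothetical examples, not actual GVP properties; a real deployment would leave this to the triplestore's reasoner.

```python
def materialize_inverses(triples, inverse_of, symmetric=()):
    """Add the inverse triple for every triple whose property has a known inverse.

    inverse_of maps a property to its owl:inverseOf partner (one direction is enough);
    symmetric lists properties that are their own inverse (owl:SymmetricProperty).
    """
    inverse = dict(inverse_of)
    inverse.update({q: p for p, q in inverse_of.items()})   # inverseOf works both ways
    inverse.update({p: p for p in symmetric})               # symmetric = self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


# Hypothetical associative relations, for illustration only.
asserted = {
    ("ex:a", "ex:teacherOf", "ex:b"),
    ("ex:a", "ex:relatedTo", "ex:c"),
}
closed = materialize_inverses(
    asserted,
    inverse_of={"ex:teacherOf": "ex:studentOf"},
    symmetric=["ex:relatedTo"],
)
for triple in sorted(closed):
    print(triple)
```

With the inverses in place, a query can start from either end of a relation without having to know in which direction it was originally recorded.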

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels



Page 71: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

533 MIX-N-MATCH

A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc

Mix-n-Match

5331 YOU CAN HELP WITH AUTHORITIES TOO

Eg checking matches to Getty AAT Single sign-on a click per item Easy

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed a mapping of BM data (2M objects) together with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

6.5 EUROPEANA LOD AND OAI-PMH

Ontotext created and hosted the Europeana SPARQL and OAI-PMH services

6.5.1 EUROPEANA STATISTICS

E.g. a chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint; open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole), and everything in between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract an FD Gazetteer

Tagarev A., Tolosi L. and Alexiev V. Domain-specific modeling: Towards a Food and Drink Gazetteer. LNCS 9398, p. 182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Alexiev V. Using DBpedia in Europeana Food and Drink. DBpedia meeting, February 2016
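
One way to picture the extraction and pruning: walk down the category tree from a root category, stop at a depth limit, and skip branches known to drift off topic. The sketch below uses a made-up toy graph; the category names, the depth limit and the blocked branch are all assumptions for illustration, not the rules actually used in EFD.

```python
from collections import deque

# Toy category graph (parent -> subcategories); purely illustrative data.
SUBCATS = {
    "Food and drink": ["Beverages", "Cuisine", "Food industry"],
    "Beverages": ["Beer", "Wine"],
    "Cuisine": ["Bulgarian cuisine"],
    "Food industry": ["Food companies"],
    "Food companies": ["Company executives"],
}

def collect_categories(root, max_depth, blocked=frozenset()):
    """Breadth-first walk from root, pruning blocked branches and deep levels."""
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        cat, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand below the depth limit
        for child in SUBCATS.get(cat, []):
            if child in blocked or child in seen:
                continue  # pruned branch, or already visited (category graphs can have cycles)
            seen.add(child)
            queue.append((child, depth + 1))
    return seen

food_cats = collect_categories("Food and drink", max_depth=2, blocked={"Food industry"})
```

Blocking "Food industry" keeps the whole company branch out of the gazetteer, which is the point of pruning: one bad category high in the tree would otherwise pull in many irrelevant articles.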

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as the second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
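
Marker clustering usually boils down to grouping nearby points and drawing one marker per group. A minimal grid-based sketch of that idea follows; it is not the Cluster Mapper library's API, and the function name and cell size are assumptions.

```python
from collections import defaultdict
import math

def cluster_points(points, cell_deg=1.0):
    """Group (lat, lon) points by grid cell; return one (centroid, count) per cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        centroid = (sum(p[0] for p in members) / n, sum(p[1] for p in members) / n)
        clusters.append((centroid, n))
    return clusters

# Two points near Sofia share a cell; a point near Varna gets its own.
points = [(42.70, 23.32), (42.65, 23.38), (43.21, 27.91)]
clusters = cluster_points(points)
```

A map library would re-run this with a smaller cell size on every zoom-in, so clusters split into individual markers as the user gets closer.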

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them
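
Jittering just means adding a small random offset to each flag. A minimal sketch, assuming a uniform offset in degrees and an approximate center point (both are illustrative choices, not the values used in EFD):

```python
import random

def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the point moved by a small random offset in each direction."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )

# Assumed approximate centre of Bulgaria, used only for illustration.
BULGARIA_CENTER = (42.7, 25.5)

rng = random.Random(42)  # fixed seed so the layout is reproducible between page loads
flags = [jitter(*BULGARIA_CENTER, rng=rng) for _ in range(5)]
```

A uniform box is the simplest choice; a real map might instead keep the jittered points inside the country outline.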

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of the LODLAM cloud (diagram by J. Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk support on twitter and google group (see home page)
Presentations, papers

Alexiev V., Lindenthal J. and Isaac A. On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). International Journal on Digital Libraries, August 2015, Springer

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014

External Ontologies

Prefix  Ontology                              Used for
bibo    Bibliography Ontology                 Sources
dc      Dublin Core Elements                  common
dct     Dublin Core Terms                     common
foaf    Friend of a Friend ontology           Contributors
iso     ISO 25964 (latest on thesauri)        iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                 Basic RDF representation
prov    Provenance Ontology                   Revision history
rdf     Resource Description Framework        Basic RDF representation
rdfs    RDF Schema                            Basic RDF representation
schema  Schema.org                            common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System  Basic vocabulary representation
skosxl  SKOS Extension for Labels             Rich labels
wgs     W3C World Geodetic Survey geo         Geo (TGN)
xsd     XML Schema Datatypes                  Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key values can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse)
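
What the inverse and symmetric declarations buy you: state a relation once, and the reverse statement follows. Below is a minimal sketch of that inference over plain tuples. The relation names and `ex:` identifiers are hypothetical, not actual GVP relations, and in the real setup the triplestore's reasoner does this rather than application code.

```python
# Hypothetical relation names; the real GVP associative relations differ.
INVERSE_OF = {"teacherOf": "studentOf", "studentOf": "teacherOf"}
SYMMETRIC = {"collaboratedWith"}

def with_inverses(triples):
    """Return the triples plus the statements implied by inverse/symmetric props."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in INVERSE_OF:
            inferred.add((o, INVERSE_OF[p], s))  # owl:inverseOf pair
        elif p in SYMMETRIC:
            inferred.add((o, p, s))              # self-inverse relation
    return inferred

asserted = {
    ("ex:A", "teacherOf", "ex:B"),
    ("ex:B", "collaboratedWith", "ex:C"),
}
closed = with_inverses(asserted)
```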

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Alexiev V., Cobb J., Garcia G., Harpring P. Getty Vocabularies: Linked Open Data, Semantic Representation. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in J. Cobb, Getty Vocabs: Why LOD? Why Now? (2014). E.g.:

AAT used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

6.8 J.P.GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples

6.8.1 J.P.GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after"

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
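
The first step of the idea above (one Person record per person mentioned, plus a few likely inferences) can be sketched as follows. The `Person` class, the relation labels and the specific inferences are assumptions for illustration, not the EHRI data model; the later matching step is not shown.

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    relation: str           # relation to the main person of the record
    inferences: tuple = ()  # likely but unverified guesses

record = {
    "id": 123456,
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

def mentioned_persons(rec):
    """Split one source record into a Person per individual it mentions."""
    persons = [Person(f"{rec['firstName']} {rec['lastName']}", "self")]
    married = rec.get("dateMarriage")
    if "nameSpouse" in rec:
        guesses = []
        if "nameSpouseMaiden" in rec:
            guesses.append(f"maidenName={rec['nameSpouseMaiden']}")
        if married:
            guesses.append(f"marriedOn={married}")
        persons.append(Person(rec["nameSpouse"], "spouse", tuple(guesses)))
    if "nameChild" in rec:
        # Likely, not certain: children are often born after the marriage date.
        guess = (f"likelyBornAfter={married}",) if married else ()
        persons.append(Person(rec["nameChild"], "child", guess))
    if "nameSibling" in rec:
        persons.append(Person(rec["nameSibling"], "sibling"))
    return persons

people = mentioned_persons(record)
```

Keeping the inferences separate from the asserted fields matters here: a guess that later turns out wrong can be dropped without touching what the source actually says.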

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed
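
To show why matching to a gazetteer also deduplicates: variant spellings that resolve to the same gazetteer id collapse into one place. The sketch below uses plain string similarity from Python's standard library; the ids, the three-entry gazetteer and the threshold are made up, and the actual EHRI pipeline is not described in these slides.

```python
from difflib import SequenceMatcher

# Tiny stand-in for a Geonames extract: (hypothetical id, name).
GAZETTEER = [(1, "Warszawa"), (2, "Krakow"), (3, "Lodz")]

def normalize(name):
    return name.strip().lower()

def best_match(place, gazetteer=GAZETTEER, threshold=0.6):
    """Return (id, name, score) of the closest gazetteer entry, or None."""
    scored = [
        (SequenceMatcher(None, normalize(place), normalize(name)).ratio(), gid, name)
        for gid, name in gazetteer
    ]
    score, gid, name = max(scored)
    return (gid, name, score) if score >= threshold else None

# Variant spellings resolve to the same id, which also deduplicates them.
source_places = ["Krakow", "Krakau", "krakow ", "Atlantis"]
matches = {p: best_match(p) for p in source_places}
```

String similarity alone is a weak signal for places. A real pipeline would also want alternate names, place type and the parent country or region to disambiguate.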

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard     Cos dist   punishment   Cos dist
guarding  0.593507   punishments  0.668144
sentry    0.512083   punish       0.601212
hlinka    0.496201   punishing    0.543213
gate      0.490032   beatings     0.527033
watching  0.484647   penalty      0.497262
rifle     0.484379   deserved     0.490157
lookout   0.482025   beaten       0.473870
patrol    0.477233   straf        0.473338
soldier   0.475982   offense      0.461230
guarded   0.474689   executing    0.459965
police    0.474291   merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
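
Both the neighbour lists and the "differencing" come down to vector arithmetic plus cosine similarity. A minimal sketch with hand-made three-dimensional toy vectors and the textbook king/queen example; real word2vec vectors are learned from the interview text and have hundreds of dimensions, so everything below is illustrative only.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hand-made toy vectors (axes: royalty, male, female); not learned from any corpus.
VECTORS = {
    "king":  (1.0, 1.0, 0.0),
    "queen": (1.0, 0.0, 1.0),
    "man":   (0.0, 1.0, 0.0),
    "woman": (0.0, 0.0, 1.0),
    "child": (0.0, 0.5, 0.5),
}

def analogy(a, b, c):
    """Word closest to vec(a) - vec(b) + vec(c), excluding the three inputs."""
    target = tuple(x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c]))
    candidates = (w for w in VECTORS if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))

result = analogy("king", "man", "woman")
```

The slide's example follows the same pattern: subtract one vector, add another, then look up the nearest remaining word.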

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames so we can get coordinates

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki


5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO

E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!


Page 73: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg

Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project

Research functions and sometimes integrated into Virtual Research Environments

61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including

CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists

62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014)

Eg AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) "Cast after"

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
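
The idea above can be sketched in Python as follows. The field names come from the sample record. The `Person` class, the `persons_from_record` helper and the two "likely inferences" shown (the spouse's maiden name and marriage date) are assumptions made for illustration, not the EHRI implementation. Matching against other records is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Person:
    name: str
    relation_to_source: str          # "self", "spouse", "child" or "sibling"
    gender: Optional[str] = None
    inferred: Dict[str, str] = field(default_factory=dict)


# Record fields that mention other people, mapped to their relation to the record subject.
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


def persons_from_record(rec: Dict[str, str]) -> List[Person]:
    """Create one Person for the record subject and one for each person mentioned."""
    people = [Person(name=f"{rec['firstName']} {rec['lastName']}",
                     relation_to_source="self",
                     gender=rec.get("gender"))]
    for fld, relation in RELATION_FIELDS.items():
        if fld not in rec:
            continue
        person = Person(name=rec[fld], relation_to_source=relation)
        if relation == "spouse":
            # Likely inferences (assumptions): maiden name and marriage date apply to the spouse too.
            if "nameSpouseMaiden" in rec:
                person.inferred["maidenName"] = rec["nameSpouseMaiden"]
            if "dateMarriage" in rec:
                person.inferred["dateMarriage"] = rec["dateMarriage"]
        people.append(person)
    return people
```

Each derived Person keeps its relation to the source record, so a later match to another record also yields an edge in the person network.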

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard     Cos dist   punishment   Cos dist
guarding  0.593507   punishments  0.668144
sentry    0.512083   punish       0.601212
hlinka    0.496201   punishing    0.543213
gate      0.490032   beatings     0.527033
watching  0.484647   penalty      0.497262
rifle     0.484379   deserved     0.490157
lookout   0.482025   beaten       0.473870
patrol    0.477233   straf        0.473338
soldier   0.475982   offense      0.461230
guarded   0.474689   executing    0.459965
police    0.474291   merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
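
The two operations behind this slide, ranking neighbours by cosine and "semantic differencing" by vector arithmetic, can be sketched in plain Python. The vectors below are tiny made-up toy values with neutral labels, not the INRIA model. The "Cos dist" values in the table appear to be similarities (larger means closer), which is how the sketch treats them.

```python
import math
from typing import Dict, Iterable, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(target: List[float], vectors: Dict[str, List[float]],
            exclude: Iterable[str] = ()) -> str:
    """Word whose vector is most similar to the target vector."""
    scores = {w: cosine(target, vec) for w, vec in vectors.items() if w not in exclude}
    return max(scores, key=scores.get)


def analogy(a: str, b: str, c: str, vectors: Dict[str, List[float]]) -> str:
    """Word closest to vec(a) - vec(b) + vec(c), excluding the three input words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(target, vectors, exclude={a, b, c})


# Toy 3-d vectors (invented for illustration): each organization shares a
# direction with its leader.
TOY = {
    "leader_a": [1.0, 0.0, 0.0],
    "leader_b": [0.0, 1.0, 0.0],
    "org_a": [1.0, 0.0, 1.0],
    "org_b": [0.0, 1.0, 1.0],
    "other": [0.5, 0.5, 0.2],
}
```

With these toy vectors, `analogy("org_a", "leader_a", "leader_b", TOY)` plays the role of "KGB - Stalin + Hitler" on the slide.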

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation (Medallic Art of the Great War)

6.13.4 NOMISMA

Shared authorities for numismatics. Eg a mint

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki

6.1 MELLON "SPACE" PROJECTS

The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists

6.2 RESEARCHSPACE

Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search

6.2.1 RESEARCHSPACE SEARCH

Powerful and precise search: Drawings by Rembrandt that are about Mammals

6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work.) Watch the video (D. Oldman)

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services

6.5.1 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but is easy with SPARQL

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint; open source (Github). Uses GraphDB and ElasticSearch enterprise connector

6.6.1 TASTY BULGARIAN RECIPES

Eg 150 with beer, including pancakes

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole), and everything in-between

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames
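
A hierarchical facet of this kind boils down to counting each object at its own place and at every ancestor of that place. The Python sketch below shows the idea. The small `PARENT` table is a hypothetical stand-in for the Geonames hierarchy, and `facet_counts` is an illustrative name. The EFD application itself computes its facets with GraphDB and ElasticSearch, not with code like this.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

# Hypothetical parent links standing in for the Geonames place hierarchy.
PARENT: Dict[str, str] = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Paris": "France",
    "France": "Europe",
}


def facet_counts(object_places: Iterable[str]) -> Counter:
    """Count each object at its place and at all ancestors of that place."""
    counts: Counter = Counter()
    for place in object_places:
        node: Optional[str] = place
        while node is not None:
            counts[node] += 1
            node = PARENT.get(node)  # None once we pass the top of the hierarchy
    return counts
```

Selecting "Bulgaria" in the facet then also covers objects tagged only with Sofia or Plovdiv.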

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
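
Jittering can be sketched as adding a small random offset to each marker's coordinates. The offset size, the approximate centre coordinates used in the usage note and the `jitter` helper are assumptions for illustration. The slide does not say how the offsets were actually chosen.

```python
import random
from typing import Optional, Tuple


def jitter(lat: float, lon: float, max_offset_deg: float = 0.5,
           rng: Optional[random.Random] = None) -> Tuple[float, float]:
    """Move a point by a uniform random offset of at most max_offset_deg per axis."""
    source = rng or random  # pass a seeded random.Random for reproducible maps
    return (lat + source.uniform(-max_offset_deg, max_offset_deg),
            lon + source.uniform(-max_offset_deg, max_offset_deg))
```

For example, `[jitter(42.7, 25.5, rng=random.Random(1)) for _ in range(3)]` would spread three markers around a point roughly in the middle of Bulgaria (coordinates approximate).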

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03

Page 76: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

621 RESEARCHSPACE SEARCH

Powerful and precise search Drawings by Rembrandt that are about Mammals

622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach

623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)

624 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata, but also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them.
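Jittering can be sketched as follows. This is not the EFD implementation, only a minimal illustration: each marker is displaced by a random offset inside a disc around the country centroid. The function name, the 50 km radius and the centroid coordinates are assumptions chosen for the example.

```python
import math
import random

KM_PER_DEG_LAT = 111.0  # rough average length of one degree of latitude

def jitter(lat, lon, max_km=50.0, rng=random):
    """Return (lat, lon) displaced randomly by at most max_km."""
    # The square root spreads points uniformly over the disc area
    # instead of crowding them near the center.
    r_km = max_km * math.sqrt(rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    dlat = (r_km * math.sin(theta)) / KM_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = (r_km * math.cos(theta)) / (KM_PER_DEG_LAT * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

if __name__ == "__main__":
    rng = random.Random(0)
    center = (42.7, 25.5)  # approximate centroid of Bulgaria (assumption)
    for _ in range(3):
        print(jitter(*center, rng=rng))
```

Passing a seeded `random.Random` keeps the marker positions stable between page loads, which avoids flags jumping around when the map is redrawn.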

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of the LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT 2014-02, TGN 2014-08, ULAN 2015-03.

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification.
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn).
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc).
GraphDB semantic repo, clustered for high availability.
Semantic application development (customized Forest user interface) and tech consulting.
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql. Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD.
Help desk support on twitter and google group (see home page).
Presentations, papers.

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation. V. Alexiev, CIDOC 2014.

External Ontologies

Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: a custom sub-class, a custom (sub-)prop, or an ontology value (e.g. <term/kind/Abbreviation>).

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
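The effect of declaring relations as inverse pairs or as symmetric can be shown with a small sketch: given one asserted direction, the other direction follows automatically. The relation names and the in-memory table below are hypothetical stand-ins for the spreadsheet; the actual GVP data relies on OWL inference in the repository rather than on Python code.

```python
# Hypothetical relation table (in the project this kind of table lives in Excel).
INVERSE_PAIRS = {"teacherOf": "studentOf", "parentOf": "childOf"}
SYMMETRIC = {"siblingOf"}

def inverse_of(prop):
    """Return the inverse property; a symmetric property is its own inverse."""
    if prop in SYMMETRIC:
        return prop
    for forward, backward in INVERSE_PAIRS.items():
        if prop == forward:
            return backward
        if prop == backward:
            return forward
    raise KeyError(f"no inverse declared for {prop!r}")

def materialize(triples):
    """Add the inverse statement for every (subject, property, object) triple."""
    result = set(triples)
    for s, p, o in triples:
        result.add((o, inverse_of(p), s))
    return result

if __name__ == "__main__":
    asserted = {("a", "teacherOf", "b"), ("c", "siblingOf", "d")}
    for triple in sorted(materialize(asserted)):
        print(triple)
```

Declaring the pairs once in a table means editors only record a relation in one direction, and queries can still follow it from either end.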

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

6.8 J. P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J. P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
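The idea can be sketched in a few lines. This is an illustration under stated assumptions, not the EHRI pipeline: the `Person` class, the relation names and the single inference rule (a spouse's maiden name yields an alternative full name to match on) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Field in the source record -> relation from the mentioned person to the main person.
ROLES = {"nameSpouse": "spouseOf", "nameChild": "childOf", "nameSibling": "siblingOf"}

@dataclass
class Person:
    name: str
    alt_names: List[str] = field(default_factory=list)
    relations: List[Tuple[str, str]] = field(default_factory=list)

def derive_persons(rec: dict) -> List[Person]:
    """Create Person stubs for the main person and everyone mentioned in the record."""
    main_name = f'{rec["firstName"]} {rec["lastName"]}'
    people = [Person(main_name)]
    for fld, relation in ROLES.items():
        name = rec.get(fld)
        if not name:
            continue
        person = Person(name, relations=[(relation, main_name)])
        # Likely inference (assumption): the spouse may also appear in other
        # records under her maiden name, so keep it as an alternative name.
        if fld == "nameSpouse" and rec.get("nameSpouseMaiden"):
            first = name.split()[0]
            person.alt_names.append(f'{first} {rec["nameSpouseMaiden"]}')
        people.append(person)
    return people

RECORD = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

if __name__ == "__main__":
    for p in derive_persons(RECORD):
        print(p)
```

Each derived stub can then be compared against existing Person records, using both `name` and `alt_names`, and the `relations` list supplies the edges of the person network.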

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews.

ONTO: Place enrichment, Person name recognition.
INRIA: word2vec experiments.

guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
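The two operations behind the table and the "semantic differencing" line can be sketched with plain Python. The vectors below are made-up toy values, not the INRIA word2vec model, and the function names are assumptions. The numbers in the "Cos dist" column appear to be cosine similarities (larger means closer), which is what `cosine` computes here.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(vectors, word, k=3):
    """Return the k words most similar to `word`, with their similarity."""
    scored = [(w, cosine(vectors[w], vectors[word])) for w in vectors if w != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Toy 3-dimensional vectors (hypothetical), using the classic textbook analogy.
TOY = {
    "king": [1.0, 1.0, 0.0],
    "man": [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "queen": [1.0, 0.0, 1.0],
    "apple": [0.1, 0.1, 0.1],
}

if __name__ == "__main__":
    print(nearest(TOY, "king"))
    print(analogy(TOY, "king", "man", "woman"))
```

With a real model, each column of the table corresponds to one `nearest` call for its header word, and the differencing line to one `analogy` call.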

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHERS' PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.


6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS

First implementation experience of the CIDOC CRM Fundamental Relations approach.

6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)

6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION

120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).



6.2.4 RESEARCHSPACE SEARCH IMPLEMENTATION

120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).

6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION

(Not Ontotext work) (D.Oldman). Watch the video.

6.2.6 RESEARCHSPACE DATA ANNOTATION

6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint: one of the biggest CH RDF collections (917M triples).
As part of RS, developed a mapping of BM data (2M objects) using CIDOC CRM, together with BM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. a chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL.
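
A minimal sketch of the idea, not the actual query behind the chart. The query text is illustrative only: the class and property choices (edm:ProvidedCHO, dc:type, dcterms:issued) are assumptions, and the helper names are hypothetical. The Python part only parses a result set in the standard SPARQL 1.1 JSON results format and draws a crude text chart.

```python
import json

# Illustrative query only. The modelling choices below (edm:ProvidedCHO,
# dc:type "newspaper", dcterms:issued) are assumptions, not from the slides.
QUERY = """
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?year (COUNT(?item) AS ?n) WHERE {
  ?item a edm:ProvidedCHO ;
        dc:type "newspaper" ;
        dcterms:issued ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
}
GROUP BY ?year
ORDER BY ?year
"""


def counts_by_year(results_json):
    """Turn a SPARQL 1.1 JSON result set (?year, ?n) into {year: count}."""
    data = json.loads(results_json)
    return {
        int(b["year"]["value"]): int(b["n"]["value"])
        for b in data["results"]["bindings"]
    }


def text_chart(counts, width=40):
    """Render one bar per year, scaled so the largest bar is `width` wide."""
    if not counts:
        return []
    peak = max(counts.values())
    lines = []
    for year in sorted(counts):
        bar = "#" * max(1, round(counts[year] * width / peak))
        lines.append(f"{year} {bar} {counts[year]}")
    return lines
```

The point of the sketch: the grouping and counting happen inside the endpoint (GROUP BY), so the client only receives one row per year.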

6.6 EUROPEANA FOOD AND DRINK

Food & Drink content semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.

EFD Semantic App

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is category level), available content, and NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
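
One possible way to jitter, as a hedged sketch: the slides do not say how the offsets were computed, so the function name, the 30 km radius and the flat-earth approximation (about 111 km per degree of latitude) are all assumptions.

```python
import math
import random


def jitter(lat, lon, max_km=30.0, rng=random):
    """Return (lat, lon) displaced by a random offset of at most max_km.

    Assumption: a simple local approximation (1 degree of latitude is about
    111 km), good enough for country-scale maps but not near the poles.
    """
    # sqrt() spreads points uniformly over the disc instead of
    # crowding them around the centre.
    distance_km = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    dlat = distance_km * math.cos(bearing) / 111.0
    dlon = distance_km * math.sin(bearing) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


# Usage: spread many objects that share one country-level coordinate.
# The centre point below is illustrative, roughly the middle of Bulgaria.
rng = random.Random(2016)  # seeded so the map looks the same on every run
flags = [jitter(42.7, 25.5, rng=rng) for _ in range(9000)]
```

Seeding the generator keeps each flag in the same spot between page loads; deriving the seed from the object id would be another option.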

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper categories, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of the LODLAM cloud (diagram by J.Cobb, 2014).
GVP Training Materials.

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification.
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn).
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc).
GraphDB semantic repo, clustered for high-availability.
Semantic application development (customized Forest user interface) and tech consulting.
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD.
Help desk / support on twitter and google group (see home page).
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014.

External Ontologies

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic System geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>).

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
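
A rough sketch of what "Excel-driven" generation of relation axioms could look like. This is not the GVP tooling: the row layout, function names and the ex: property names are hypothetical, chosen only to show inverse pairs and the self-inverse (symmetric) case.

```python
def relation_axioms(rows):
    """Emit Turtle-like axiom lines for (property, inverse) spreadsheet rows.

    A row whose property equals its inverse is treated as self-inverse
    and declared owl:SymmetricProperty.
    """
    lines = []
    for prop, inverse in rows:
        if prop == inverse:
            lines.append(f"{prop} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            lines.append(f"{prop} a owl:ObjectProperty ; owl:inverseOf {inverse} .")
            lines.append(f"{inverse} a owl:ObjectProperty ; owl:inverseOf {prop} .")
    return lines


def add_inverses(triples, rows):
    """Materialize the inverse triple for every stated associative relation."""
    inverse_of = {}
    for prop, inverse in rows:
        inverse_of[prop] = inverse
        inverse_of[inverse] = prop
    result = set(triples)
    for s, p, o in triples:
        if p in inverse_of:
            result.add((o, inverse_of[p], s))
    return result


# Hypothetical rows, as they might appear in two spreadsheet columns.
ROWS = [("ex:teacherOf", "ex:studentOf"), ("ex:siblingOf", "ex:siblingOf")]
```

In a real triple store the inverse triples would normally come from the reasoner; add_inverses only shows what that inference produces.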

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo. CIDOC Congress 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
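
The first two steps can be sketched roughly as follows. This is an illustration under stated assumptions, not EHRI code: the output field names (role, sourceRecord, inferred) are hypothetical, and the gender inference for the spouse is only a "likely" guess that is explicitly flagged as inferred.

```python
def split_name(full_name):
    """Split 'Maria Smith' into ('Maria', 'Smith'); the last token is the surname."""
    first, _, last = full_name.rpartition(" ")
    return first, last


def expand_record(rec):
    """Derive one Person dict per person mentioned in a flat source record."""
    people = [{
        "role": "self", "sourceRecord": rec["id"],
        "firstName": rec["firstName"], "lastName": rec["lastName"],
        "gender": rec.get("gender"), "inferred": [],
    }]
    for role, key in (("spouse", "nameSpouse"),
                      ("child", "nameChild"),
                      ("sibling", "nameSibling")):
        if not rec.get(key):
            continue
        first, last = split_name(rec[key])
        person = {"role": role, "sourceRecord": rec["id"],
                  "firstName": first, "lastName": last,
                  "gender": None, "inferred": []}
        if role == "spouse":
            if rec.get("nameSpouseMaiden"):
                person["maidenName"] = rec["nameSpouseMaiden"]
            if rec.get("dateMarriage"):
                person["dateMarriage"] = rec["dateMarriage"]
            # Likely but not certain: record it as an inference, not a fact.
            opposite = {"Male": "Female", "Female": "Male"}.get(rec.get("gender"))
            if opposite:
                person["gender"] = opposite
                person["inferred"].append("gender")
        people.append(person)
    return people


REC = {"id": "123456", "firstName": "John", "lastName": "Smith",
       "gender": "Male", "dateMarriage": "1921-01-05",
       "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
       "nameChild": "Mike Smith", "nameSibling": "Jack Jones"}
```

Keeping the list of inferred fields on each derived Person lets the later matching step weigh guessed values lower than stated ones.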

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline for free text was also developed.
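
A toy illustration of why matching to a gazetteer also deduplicates: spelling variants that resolve to the same gazetteer id collapse into one place. The gazetteer below is a hypothetical stand-in for Geonames (the ids are made up), and exact matching on normalized names is an assumption; the actual pipeline is not described in the slides.

```python
import unicodedata
from collections import defaultdict


def normalize(name):
    """Casefold, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def build_index(gazetteer):
    """Map every normalized name or alternate name to its place id."""
    index = {}
    for place_id, names in gazetteer.items():
        for name in names:
            index.setdefault(normalize(name), place_id)
    return index


def match_places(source_names, index):
    """Group source place names by matched id; return (matched, unmatched)."""
    matched = defaultdict(list)
    unmatched = []
    for name in source_names:
        place_id = index.get(normalize(name))
        if place_id is None:
            unmatched.append(name)
        else:
            matched[place_id].append(name)
    return dict(matched), unmatched


# Hypothetical gazetteer: ids are invented, names are real-world variants.
GAZETTEER = {"place-1": ["Kraków", "Cracow", "Krakau"],
             "place-2": ["Warszawa", "Warsaw"]}
```

Names left in the unmatched list would need fuzzier techniques (edit distance, place hierarchy, coordinates) than this sketch covers.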

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews.

ONTO: Place enrichment, Person name recognition.
INRIA: word2vec experiments.

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
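
How such nearest-neighbour lists and the "semantic differencing" arithmetic work, as a small sketch. The three-dimensional vectors below are invented by hand purely for illustration (real word2vec vectors are learned from the interview text and have hundreds of dimensions), so only the mechanics carry over, not the numbers.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest(target, vectors, exclude=()):
    """Rank vocabulary words by cosine similarity to the target vector."""
    candidates = [(cosine(target, vec), word)
                  for word, vec in vectors.items() if word not in exclude]
    return sorted(candidates, reverse=True)


def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(target, vectors, exclude={a, b, c})[0][1]


# Hand-made toy vectors (assumption): axes are roughly
# "Soviet vs German", "is a leader", "is an organization".
TOY = {
    "KGB":    [1.0, 0.0, 1.0],
    "Stalin": [1.0, 1.0, 0.0],
    "Hitler": [-1.0, 1.0, 0.0],
    "SS":     [-1.0, 0.0, 1.0],
    "guard":  [0.0, 0.1, 0.9],
}
```

With these toy vectors, analogy("KGB", "Stalin", "Hitler", TOY) picks "SS", mirroring the slide's example; the input words themselves are excluded from the candidates, as is usual for this kind of query.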

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHERS' PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

Page 80: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION

(Not Ontotext work) (DOldman)Watch the video

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO Place enrichment Person name recognitionINRIA word2vec experiments

guard Cos dist punishment Cos dist

guard Cos dist punishment Cos dist

guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123

semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper

Art History networks from Wikipedia through VIAF idTime and nationality from ULAN

612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT

613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD

Ethan Gruber

Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War

Art ofDevastation

6134 NOMISMA

Shared authorities for numismatics Eg a mint

6135 COINHOARDS

Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)

CoinHoardsorgnomismaorg

6136 STATISTICAL CHARTS

Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js

6137 KERAMEIKOS POTTERY LOD

editor Based on XForms leverages Getty and BM LODKerameikos Project

6138 EADITOR AND XEAC

Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki

Page 81: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

626 RESEARCHSPACE DATA ANNOTATION

627 RESEARCHSPACE DATA ANNOTATION MODEL

628 IMAGE ANNOTATION

629 IMAGE ANNOTATION MODEL

6210 IMAGE ANNOTATION ARCHITECTURE

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
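The idea above can be sketched in a few lines of Python. This is only an illustration, not the EHRI implementation: the `Person` class, the relation labels and the gender heuristic are assumptions made for the example; only the field names come from the sample record.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Person:
    full_name: str
    gender: Optional[str] = None
    maiden_name: Optional[str] = None
    # (relation, full name of the related person) pairs
    relations: List[Tuple[str, str]] = field(default_factory=list)


# Record field -> (relation seen from the main person, inverse relation).
# The labels are hypothetical; the field names follow the sample record.
RELATION_FIELDS = {
    "nameSpouse": ("spouse", "spouse"),
    "nameChild": ("child", "parent"),
    "nameSibling": ("sibling", "sibling"),
}


def derive_persons(record: dict) -> List[Person]:
    """Create a Person for the main subject and for each person mentioned."""
    main = Person(f"{record['firstName']} {record['lastName']}", record.get("gender"))
    persons = [main]
    for field_name, (relation, inverse) in RELATION_FIELDS.items():
        other_name = record.get(field_name)
        if not other_name:
            continue
        other = Person(other_name)
        if relation == "spouse":
            other.maiden_name = record.get("nameSpouseMaiden")
            # Assumed heuristic ("likely inference") for records of this period
            if main.gender == "Male":
                other.gender = "Female"
        main.relations.append((relation, other_name))
        other.relations.append((inverse, main.full_name))
        persons.append(other)
    return persons


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
persons = derive_persons(record)
```

The derived records can then be compared against existing Person records, e.g. by name and by the relations they carry.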

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
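To make the two operations concrete, here is a minimal Python sketch of cosine similarity between word vectors and of "semantic differencing" by vector arithmetic. The 4-dimensional vectors are invented for illustration; real word2vec vectors are learned from the interview text and have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Invented toy dimensions: [organization, Soviet, Nazi, leader]
toy = {
    "KGB": [1.0, 1.0, 0.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0, 1.0],
    "Hitler": [0.0, 0.0, 1.0, 1.0],
    "SS": [1.0, 0.0, 1.0, 0.0],
    "gate": [0.2, 0.1, 0.1, 0.0],
}
result = analogy(toy, "KGB", "Stalin", "Hitler")
```

With these toy vectors the arithmetic lands exactly on the vector for "SS"; with trained vectors the match is only approximate, which is why the nearest neighbour is taken.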

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id.
Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.


6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL

6.2.8 IMAGE ANNOTATION

6.2.9 IMAGE ANNOTATION MODEL

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
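For readers unfamiliar with marker clustering, the general idea can be sketched as follows. This is a simple grid-based grouping written for illustration; it is not the algorithm of the library named above, and the function name, cell size and approximate city coordinates are assumptions.

```python
import math
from collections import defaultdict


def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points into square grid cells of cell_deg degrees.

    Returns one cluster per occupied cell, with its centroid and point count.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append({
            "lat": sum(p[0] for p in members) / n,
            "lon": sum(p[1] for p in members) / n,
            "count": n,
        })
    return clusters


# Approximate coordinates: two points near Sofia, one each for Plovdiv and Varna
points = [(42.70, 23.32), (42.65, 23.40), (42.14, 24.75), (43.21, 27.91)]
clusters = grid_clusters(points)
```

A map would then draw one marker per cluster, labelled with its count, and recompute the clusters with a smaller cell size as the user zooms in.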

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
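Jittering can be as simple as adding a small random offset to each marker. A minimal sketch, assuming a uniform offset in degrees; the centre coordinates and the offset size are rough illustrative values, not the ones used in the app.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Move a point by a random offset of at most max_offset_deg on each axis."""
    return (lat + rng.uniform(-max_offset_deg, max_offset_deg),
            lon + rng.uniform(-max_offset_deg, max_offset_deg))


rng = random.Random(42)          # seeded so the layout is reproducible
center = (42.7, 25.5)            # rough centre of Bulgaria (assumed)
markers = [jitter(*center, rng=rng) for _ in range(5)]
```

Seeding the generator keeps the flags in the same place between page loads, so objects do not appear to move around.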

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of the LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust. AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03.

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk support on twitter and google group (see home page)
Presentations, papers:

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.

External Ontologies

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
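As a reminder of how such prefixes are used, the sketch below expands a prefixed name into a full URI. Only a few of the prefixes are included, with their standard published namespaces; the `expand` helper is a hypothetical name written for this example.

```python
# Standard namespace URIs for a few of the prefixes in the table above
NAMESPACES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "skosxl": "http://www.w3.org/2008/05/skos-xl#",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(prefixed_name):
    """Expand a prefixed name such as 'skos:prefLabel' into a full URI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in NAMESPACES:
        raise KeyError(f"unknown prefix: {prefix}")
    return NAMESPACES[prefix] + local


pref_label = expand("skos:prefLabel")
literal_form = expand("skosxl:literalForm")
```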

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
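What inverse and symmetric declarations buy you can be shown with a small sketch that materializes the implied triples. This is plain Python over tuples, not a reasoner; the property names (`ex:teacherOf`, `ex:studentOf`, `ex:relatedTo`) are hypothetical and not taken from the GVP ontology.

```python
def materialize_inverses(triples, inverse_of, symmetric):
    """Add the triples implied by inverse-property pairs and symmetric properties."""
    # Make the inverse mapping work in both directions
    inverse = dict(inverse_of)
    inverse.update({q: p for p, q in inverse_of.items()})
    inferred = set(triples)
    for s, p, o in triples:
        if p in inverse:
            inferred.add((o, inverse[p], s))
        if p in symmetric:
            inferred.add((o, p, s))
    return inferred


triples = [
    ("ex:A", "ex:teacherOf", "ex:B"),
    ("ex:C", "ex:relatedTo", "ex:D"),
]
inferred = materialize_inverses(
    triples,
    inverse_of={"ex:teacherOf": "ex:studentOf"},
    symmetric={"ex:relatedTo"},
)
```

Each relation only needs to be asserted in one direction; the other direction follows from the ontology declarations.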

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS


6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO Place enrichment Person name recognitionINRIA word2vec experiments

guard Cos dist punishment Cos dist

guard Cos dist punishment Cos dist

guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123

semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper

Art History networks from Wikipedia through VIAF idTime and nationality from ULAN

612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT

613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD

Ethan Gruber

Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War

Art ofDevastation

6134 NOMISMA

Shared authorities for numismatics Eg a mint

6135 COINHOARDS

Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)

CoinHoardsorgnomismaorg

6136 STATISTICAL CHARTS

Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js

6137 KERAMEIKOS POTTERY LOD

editor Based on XForms leverages Getty and BM LODKerameikos Project

6138 EADITOR AND XEAC

Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki

Page 85: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

6.2.10 IMAGE ANNOTATION ARCHITECTURE

6.3 BRITISH MUSEUM (BM) AND YCBA LOD

GraphDB runs the BM SPARQL endpoint, one of the biggest CH RDF collections (917M triples).
As part of RS, developed a mapping of BM data (2M objects) to CIDOC CRM, together with BM.
This mapping was followed by the Yale Center for British Art (YCBA).

Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram.

6.4 CONSERVATIONSPACE

Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management) and Smart Documents (Sirma product).

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. a chart of newspapers (several million) by year: this can't be done using the Europeana API, but is easy with SPARQL.
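A chart like this boils down to a grouped count. The sketch below shows the general shape of such a query as a string and reproduces the same aggregation locally in Python. The property choices (dc:type, dc:date), the "newspaper" literal and the sample rows are assumptions for illustration, not the actual Europeana data model or data.

```python
from collections import Counter

# Hypothetical query shape; dc:type, dc:date and the "newspaper" literal are
# assumptions, not the real Europeana data model.
QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?year (COUNT(?item) AS ?n) WHERE {
  ?item dc:type "newspaper" ; dc:date ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
}
GROUP BY ?year ORDER BY ?year
"""

def count_by_year(rows):
    """Local equivalent of GROUP BY ?year: rows are (item_id, iso_date) pairs."""
    counts = Counter(date[:4] for _item, date in rows)
    return sorted(counts.items())

# Made-up sample rows, only to show the aggregation.
SAMPLE = [("n1", "1914-08-01"), ("n2", "1914-11-11"), ("n3", "1918-11-11")]
```

The point of the design is that the grouping happens on the server, so the client only receives one row per year rather than millions of items.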

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia categories to extract an FD gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBpedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.
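Extracting a gazetteer from a category tree and pruning irrelevant branches can be sketched as a bounded breadth-first walk. This is a minimal illustration, not the method of the cited papers: the toy category graph, the depth limit and the pruned branch are all made up.

```python
from collections import deque

# Toy category graph: parent -> subcategories. All names are made up.
SUBCATS = {
    "Food and drink": ["Beverages", "Foods", "Food and drink companies"],
    "Beverages": ["Beer", "Wine"],
    "Foods": ["Pancakes"],
    "Food and drink companies": ["Restaurant chains"],
}

def collect_categories(root, subcats, max_depth, pruned=frozenset()):
    """Breadth-first walk from root, skipping pruned branches and deep levels."""
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        cat, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand below the depth limit
        for child in subcats.get(cat, []):
            if child in pruned or child in seen:
                continue
            seen.add(child)
            queue.append((child, depth + 1))
    return seen
```

Pruning a category also drops everything reachable only through it, which is why "Restaurant chains" disappears once "Food and drink companies" is pruned.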

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is category level), available content and NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia and Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.
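A hierarchical facet needs each object counted for its own place and for every ancestor of that place. The following is a minimal sketch of that roll-up; the tiny child-to-parent map stands in for the Geonames hierarchy and the object list is invented.

```python
from collections import Counter

# Toy place hierarchy (child -> parent); names only, no real Geonames ids.
PARENT = {"Sofia": "Bulgaria", "Plovdiv": "Bulgaria", "Bulgaria": "Europe"}

def facet_counts(object_places, parent):
    """Count each object once for its place and once for every ancestor."""
    counts = Counter()
    for place in object_places:
        node = place
        while node is not None:
            counts[node] += 1
            node = parent.get(node)  # None once we pass the top of the tree
    return counts

# Four made-up objects: two in Sofia, one in Plovdiv, one tagged only "Bulgaria".
counts = facet_counts(["Sofia", "Sofia", "Plovdiv", "Bulgaria"], PARENT)
```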

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them.
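Jittering can be sketched as adding a small random offset to each marker. The offset size, the uniform distribution and the approximate center coordinates below are assumptions chosen for illustration, not the values used in the EFD application.

```python
import random

def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Move a point by a uniform random offset of at most radius_deg per axis."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))

# Approximate center of Bulgaria, used here only as an example value.
CENTER = (42.7, 25.5)
rng = random.Random(0)  # fixed seed so the scatter is reproducible
points = [jitter(*CENTER, rng=rng) for _ in range(5)]
```

A real map would probably scale the radius to the size of the place, so that markers for a country spread wider than markers for a town.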

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper categories, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of the LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri): provided implementation experience, suggestions and fixes.
Complete mapping specification.
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn).
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc).
GraphDB semantic repository, clustered for high availability.
Semantic application development (customized Forest user interface) and tech consulting.
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD.
Help desk: support on Twitter and Google group (see home page).
Presentations, papers:

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation. V. Alexiev, CIDOC 2014.

External Ontologies

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)property
Ontology value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, which is self-inverse).
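Generating such axioms from spreadsheet rows can be sketched as follows. This is only an illustration of the idea: the row layout and the ex: relation names are hypothetical and do not reproduce the actual GVP spreadsheet or relation codes.

```python
# Rows as they might come from a spreadsheet: (relation, inverse).
# The relation names are made up; a symmetric relation lists itself as inverse.
ROWS = [
    ("ex:teacherOf", "ex:studentOf"),
    ("ex:relatedTo", "ex:relatedTo"),
]

def relation_axioms(rows):
    """Emit one Turtle statement per row: symmetric or inverse-pair."""
    lines = []
    for rel, inv in rows:
        if rel == inv:
            # Self-inverse relation: declare it symmetric.
            lines.append(f"{rel} a owl:SymmetricProperty .")
        else:
            lines.append(f"{rel} owl:inverseOf {inv} .")
    return lines
```

Keeping the pairs in a table means domain experts can review and extend the relations without touching the ontology files directly.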

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search.
Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015), it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced the chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match them to other Person records in the database.
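Creating candidate Person records from such a record can be sketched like this. The field names follow the example record above; the output structure, the name-splitting rule and the particular inferences (maiden name and marriage date copied to the spouse) are assumptions for illustration, not the EHRI implementation.

```python
RECORD = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

def derive_persons(rec):
    """Create candidate Person records for relatives mentioned in a record."""
    persons = []
    for field, relation in [("nameSpouse", "spouse"), ("nameChild", "child"),
                            ("nameSibling", "sibling")]:
        name = rec.get(field)
        if not name:
            continue
        # Naive split: everything before the last space is the first name.
        first, _, last = name.rpartition(" ")
        person = {"firstName": first, "lastName": last,
                  "relation": relation, "relatedTo": rec["id"]}
        if relation == "spouse":
            # Likely inferences (assumptions): maiden name, same marriage date.
            person["maidenName"] = rec.get("nameSpouseMaiden")
            person["dateMarriage"] = rec.get("dateMarriage")
        persons.append(person)
    return persons
```

The derived records carry a link back to the source record, so that a later matching step can connect them to existing Person records and build the network.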

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
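Why matching to a gazetteer also deduplicates can be shown with a very small sketch: name variants that normalize to the same key end up under one identifier. The normalization rule, the gazetteer content and the id value 1 are all made up; real matching would need far more than exact comparison of normalized names.

```python
import unicodedata

def normalize(name):
    """Lower-case, strip combining accents and punctuation for loose comparison."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    kept = "".join(ch for ch in no_accents.lower() if ch.isalnum() or ch == " ")
    return kept.strip()

def match_places(names, gazetteer):
    """Map each source name to a gazetteer id; names sharing an id are duplicates."""
    matches = {}
    for name in names:
        place_id = gazetteer.get(normalize(name))
        if place_id is not None:
            matches.setdefault(place_id, []).append(name)
    return matches

# Hypothetical gazetteer: normalized name -> made-up id (not a real Geonames id).
GAZETTEER = {"krakow": 1}
matches = match_places(["Kraków", "Krakow", "KRAKOW.", "Nowhere"], GAZETTEER)
```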

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH interviews:

ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
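The arithmetic behind such a result can be sketched with toy vectors. The three-dimensional vectors below are invented so that the example works out exactly; real word2vec vectors are learned from text and have hundreds of dimensions. The scores in the table appear to behave like cosine similarity (higher means closer), which is what the sketch computes.

```python
import math

# Toy vectors with hand-picked axes (organization, person, regime).
# These are made up for illustration, not learned embeddings.
VEC = {
    "KGB": [1.0, 0.0, 1.0], "Stalin": [0.0, 1.0, 1.0],
    "Hitler": [0.0, 1.0, -1.0], "SS": [1.0, 0.0, -1.0],
    "guard": [0.5, 0.5, 0.0],
}

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c, vec):
    """Nearest word to vec[a] - vec[b] + vec[c], excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vec[a], vec[b], vec[c])]
    candidates = (w for w in vec if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vec[w], target))
```

Calling `analogy("KGB", "Stalin", "Hitler", VEC)` subtracts the "Stalin" direction, adds the "Hitler" direction and returns the closest remaining word.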

6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

Art History networks from Wikipedia, through VIAF id.
Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval charters and deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS: POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki

Page 86: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)

very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO Place enrichment Person name recognitionINRIA word2vec experiments

guard Cos dist punishment Cos dist

guard Cos dist punishment Cos dist

guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123

semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper

Art History networks from Wikipedia through VIAF idTime and nationality from ULAN

612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT

613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD

Ethan Gruber

Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War

Art ofDevastation

6134 NOMISMA

Shared authorities for numismatics Eg a mint

6135 COINHOARDS

Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)

CoinHoardsorgnomismaorg

6136 STATISTICAL CHARTS

Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js

6137 KERAMEIKOS POTTERY LOD

editor Based on XForms leverages Getty and BM LODKerameikos Project

6138 EADITOR AND XEAC

Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki

Page 87: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)

65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services

651 EUROPEANA STATISTICS

Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
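To make the inverse/symmetric pairing concrete, here is a minimal sketch of how spreadsheet rows could be turned into Turtle declarations. The relation names (`ex:producedBy` etc.) and the two-column row layout are assumptions for illustration only; they are not the actual GVP relations or the actual spreadsheet format.

```python
# Hypothetical rows, as they might come from a spreadsheet: (relation, inverse).
# A relation listed as its own inverse is treated as symmetric (self-inverse).
RELATION_PAIRS = [
    ("ex:producedBy", "ex:produces"),
    ("ex:relatedTo", "ex:relatedTo"),
]


def relation_axioms(pairs):
    """Return Turtle statements declaring inverse or symmetric properties."""
    lines = []
    for rel, inv in pairs:
        if rel == inv:
            # Self-inverse relation: one symmetric property is enough.
            lines.append(f"{rel} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            # Declare both directions so either one can be looked up.
            lines.append(f"{rel} a owl:ObjectProperty ; owl:inverseOf {inv} .")
            lines.append(f"{inv} a owl:ObjectProperty ; owl:inverseOf {rel} .")
    return lines


for statement in relation_axioms(RELATION_PAIRS):
    print(statement)
```

Keeping the pairs in one table means a new relation and its inverse are always added together.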

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Alexiev V., Cobb J., Garcia G., Harpring P. Getty Vocabularies: Linked Open Data: Semantic Representation. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search.
Many described in Getty Vocabs: Why LOD? Why Now?, J. Cobb, 2014. E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName “John”, lastName “Smith”, gender Male, dateMarriage 1921-01-05
additional names: nameSpouseMaiden “Matienzo”, nameSpouse “Maria Smith”, nameChild “Mike Smith”, nameSibling “Jack Jones”
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
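The idea of expanding one record into several Person records can be sketched roughly as below. The `Person` class, the `expand_record` helper and the particular inferences (spouse gender, maiden name) are assumptions made up for this illustration; the slides do not say how EHRI actually implements this step.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    relation: str  # relation to the main person of the source record
    inferred: dict = field(default_factory=dict)  # likely, unconfirmed facts


def expand_record(rec):
    """Create Person stubs for everyone mentioned in one source record."""
    main_name = f'{rec["firstName"]} {rec["lastName"]}'
    people = [Person(main_name, "self", {"gender": rec.get("gender")})]
    if "nameSpouse" in rec:
        spouse = Person(rec["nameSpouse"], "spouse")
        if "nameSpouseMaiden" in rec:
            spouse.inferred["maidenName"] = rec["nameSpouseMaiden"]
        if rec.get("gender") == "Male":
            # A likely inference for this period, not a certainty.
            spouse.inferred["gender"] = "Female"
        people.append(spouse)
    for key, relation in (("nameChild", "child"), ("nameSibling", "sibling")):
        if key in rec:
            people.append(Person(rec[key], relation))
    return people


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
for person in expand_record(record):
    print(person)
```

Each stub keeps its inferred facts separate from the name, so a later matching step can weigh them as hints rather than as hard evidence.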

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

semantic differencing (interesting):
KGB - Stalin + Hitler = SS
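For readers new to word2vec, the two operations above (ranking neighbours by cosine, and "semantic differencing" by vector arithmetic) can be sketched in plain Python. The 2-d vectors below are invented toy values chosen so the arithmetic works out; real embeddings have hundreds of dimensions and come from a trained model, which is not shown here.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors (assumed values, for illustration only).
TOY = {
    "KGB": [1.0, 1.0],
    "Stalin": [0.0, 1.0],
    "Hitler": [0.0, -1.0],
    "SS": [1.0, -1.0],
    "guard": [0.5, 0.1],
}

print(analogy(TOY, "KGB", "Stalin", "Hitler"))
```

With these toy values the target vector is [1, -1], so the nearest remaining word is "SS".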

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id.
Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki

6.5 EUROPEANA LOD AND OAI PMH

Ontotext created and hosted the Europeana SPARQL and OAI PMH services.

6.5.1 EUROPEANA STATISTICS

E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL.

6.6 EUROPEANA FOOD AND DRINK

EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.

6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole), and everything in between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Alexiev V. Using DBPedia in Europeana Food and Drink. DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
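Jittering can be as simple as adding a small random offset to each marker. The sketch below is only an illustration of that idea: the `jitter` helper, the 0.5 degree radius and the approximate centre point are assumptions, not the values or code used in the EFD app.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Move a point by a random offset of at most radius_deg on each axis."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))


BULGARIA_CENTER = (42.7, 25.5)  # approximate lat/lon, assumed for the example

rng = random.Random(42)  # fixed seed so the layout is reproducible
markers = [jitter(*BULGARIA_CENTER, rng=rng) for _ in range(5)]
for lat, lon in markers:
    print(f"{lat:.4f}, {lon:.4f}")
```

A fixed seed keeps markers in the same place between page loads; a real map would probably also scale the radius with the size of the place.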

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials

Page 90: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector

EFD Semantic App

661 TASTY BULGARIAN RECIPES

Eg 150 with beer including pancakes

662 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Alexiev V., Cobb J., Garcia G., Harpring P. Getty Vocabularies Linked Open Data: Semantic Representation. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search.
Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:

AAT used in Cataloging Calculator: finds bibliographic and authority data (language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.).

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName “John”, lastName “Smith”, gender Male, dateMarriage 1921-01-05. Additional names: nameSpouseMaiden “Matienzo”, nameSpouse “Maria Smith”, nameChild “Mike Smith”, nameSibling “Jack Jones”.
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
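A minimal sketch of the first two steps (creating Person records and making likely inferences) might look like this. The field names follow the sample record above; the `Person` class, the `persons_from_record` helper and the specific inference (guessing the spouse's gender from the main person's) are illustrative assumptions, and every guessed field is flagged so that later matching can treat it with lower confidence.

```python
from dataclasses import dataclass, field
from typing import Dict, List

OPPOSITE = {"Male": "Female", "Female": "Male"}


@dataclass
class Person:
    name: str
    relation: str                  # relation to the main person of the record
    gender: str = ""
    maiden_name: str = ""
    inferred: List[str] = field(default_factory=list)  # fields that were guessed


def persons_from_record(rec: Dict[str, str]) -> List[Person]:
    """Create one Person per name mentioned in a record."""
    main = Person(
        name=f"{rec['firstName']} {rec['lastName']}",
        relation="self",
        gender=rec.get("gender", ""),
    )
    people = [main]
    if rec.get("nameSpouse"):
        spouse = Person(
            name=rec["nameSpouse"],
            relation="spouse",
            maiden_name=rec.get("nameSpouseMaiden", ""),
        )
        # Likely (not certain) inference: spouse has the other gender.
        if main.gender in OPPOSITE:
            spouse.gender = OPPOSITE[main.gender]
            spouse.inferred.append("gender")
        people.append(spouse)
    for key, relation in (("nameChild", "child"), ("nameSibling", "sibling")):
        if rec.get(key):
            people.append(Person(name=rec[key], relation=relation))
    return people


if __name__ == "__main__":
    record = {
        "firstName": "John", "lastName": "Smith", "gender": "Male",
        "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
        "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
        "nameSibling": "Jack Jones",
    }
    for person in persons_from_record(record):
        print(person)
```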

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
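The slide does not describe the matching method, so the following is only a simplified sketch of one common approach: normalize names (case, whitespace, diacritics) and look them up among a gazetteer's alternate names. Source variants that land on the same gazetteer id are thereby deduplicated. The ids `G1` and `G2` are placeholders, not real Geonames ids, and the function names are hypothetical.

```python
import unicodedata
from collections import defaultdict


def normalize(name: str) -> str:
    """Lowercase, collapse whitespace and strip combining diacritics."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def match_places(source_names, gazetteer):
    """Return ({gazetteer_id: [matching source names]}, [unmatched names])."""
    index = {}
    for gid, names in gazetteer.items():
        for alt in names:
            index.setdefault(normalize(alt), gid)
    matched = defaultdict(list)
    unmatched = []
    for name in source_names:
        gid = index.get(normalize(name))
        if gid is None:
            unmatched.append(name)
        else:
            matched[gid].append(name)
    return dict(matched), unmatched


if __name__ == "__main__":
    gazetteer = {  # placeholder ids with alternate names
        "G1": ["Kraków", "Cracow", "Krakau"],
        "G2": ["Terezín", "Theresienstadt"],
    }
    names = ["Krakow", "KRAKÓW ", "Cracow", "Terezín", "Atlantis"]
    print(match_places(names, gazetteer))
```

A real pipeline would also need disambiguation (many places share a name), for example by country or feature type.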

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
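The two operations behind the table and the "semantic differencing" example can be sketched in plain Python: cosine similarity between word vectors, and vector arithmetic followed by a nearest-neighbour lookup. The three-dimensional vectors below are hand-made toy values chosen so that the arithmetic works out; they are not learned embeddings and not the INRIA results. Note that the values in the table appear to be similarities (higher means closer), even though the column is labelled "Cos dist".

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors):
    """Return the word whose vector is closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [
        (cosine(target, vec), word)
        for word, vec in vectors.items()
        if word not in (a, b, c)
    ]
    return max(candidates)[1]


# Toy vectors (assumed for illustration): [soviet, german, is_organization]
VECTORS = {
    "KGB": [1.0, 0.0, 1.0],
    "Stalin": [1.0, 0.0, 0.0],
    "Hitler": [0.0, 1.0, 0.0],
    "SS": [0.0, 1.0, 1.0],
    "Red Army": [0.9, 0.1, 0.8],
}

if __name__ == "__main__":
    print(analogy("KGB", "Stalin", "Hitler", VECTORS))
```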

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id.
Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.


6.6.1 TASTY BULGARIAN RECIPES

E.g. 150 with beer, including pancakes.

6.6.2 WIDE GEOGRAPHIC COVERAGE

Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.

6.6.3 EFD ENRICHMENT: FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer.

Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.
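One way to think about a hierarchical facet is that each object counts towards its own place and towards every ancestor of that place. The sketch below assumes a simple child-to-parent map forming a tree; the place names, the map and `facet_counts` are illustrative and do not come from the EFD implementation.

```python
from collections import Counter


def facet_counts(object_places, parent):
    """Count objects per place, rolling counts up to all ancestors.

    `parent` maps a place to its parent place; top-level places are absent
    from the map. The map is assumed to contain no cycles.
    """
    counts = Counter()
    for place in object_places:
        node = place
        while node is not None:
            counts[node] += 1
            node = parent.get(node)
    return counts


if __name__ == "__main__":
    parent = {"Sofia": "Bulgaria", "Varna": "Bulgaria", "Bulgaria": "Europe"}
    objects = ["Sofia", "Sofia", "Varna", "Bulgaria"]
    for place, n in facet_counts(objects, parent).most_common():
        print(place, n)
```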

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
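To illustrate what marker clustering does in general, here is a simple grid-based sketch: points falling into the same latitude/longitude cell are merged into one cluster with a count and a centroid. This is a stand-in technique for illustration only; it is not claimed to be the algorithm of the library used in the project, and the coordinates are approximate.

```python
import math
from collections import defaultdict


def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points by grid cell; return one centroid and count per cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append({
            "lat": sum(p[0] for p in members) / n,
            "lon": sum(p[1] for p in members) / n,
            "count": n,
        })
    return clusters


if __name__ == "__main__":
    # Two points near Sofia and one near Varna (approximate coordinates).
    points = [(42.70, 23.32), (42.65, 23.38), (43.21, 27.91)]
    for cluster in grid_clusters(points):
        print(cluster)
```

In an interactive map the cell size would typically shrink as the user zooms in.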

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
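Jittering can be sketched as adding a small random offset to each marker that shares the same coordinates. The centre coordinates and the offset size below are assumed values for illustration, and `jitter` is a hypothetical helper; a seeded generator keeps the layout reproducible between page loads.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=None):
    """Return (lat, lon) moved by a random offset of at most max_offset_deg per axis."""
    rng = rng if rng is not None else random
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


# Approximate centre of Bulgaria; an assumed value for illustration only.
CENTER = (42.7, 25.5)

if __name__ == "__main__":
    rng = random.Random(42)  # seeded so the same markers land in the same spots
    for _ in range(5):
        lat, lon = jitter(*CENTER, rng=rng)
        print(f"{lat:.4f}, {lon:.4f}")
```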

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM.
Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials.

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.



Page 93: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

663 EFD ENRICHMENT FD GAZETTEER

Use Wikipedia Categories to extract a FD Gazetteer

Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint

664 EFD ENRICHMENT PRUNING FD CATEGORY TREE

Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink

665 EFD ENRICHMENT FRENCH

Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities

666 EFD PLACE ENRICHMENT

We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy

667 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames

668 EFD GEOGRAPHIC MAPPING CLUSTERING

Once we have places its relatively easy to map them We used the Cluster Mapperlibrary

669 EFD GEOGRAPHIC MAPPING JITTERING

There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016.

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
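The idea above can be sketched in a few lines of Python. This is only an illustration, not the EHRI implementation: the field names come from the sample record, while the `Person` class, the relation names and the two inferences (the spouse's gender and maiden-name variant) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical mapping from record fields to the relation each one expresses.
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}
# How the record's subject relates back to each mentioned person.
INVERSE = {"spouse": "spouse", "child": "parent", "sibling": "sibling"}


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    alt_names: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (relation, other person's name)


def persons_from_record(rec: dict) -> list:
    """Create a Person for the record's subject and for every person it mentions."""
    subject = Person(name=f"{rec['firstName']} {rec['lastName']}", gender=rec.get("gender"))
    people = [subject]
    for field_name, relation in RELATION_FIELDS.items():
        name = rec.get(field_name)
        if not name:
            continue
        other = Person(name=name)
        if relation == "spouse":
            # Likely inferences (assumptions, to be confirmed by later matching):
            # the spouse of a Male subject is Female, and she may also appear
            # elsewhere under her maiden name.
            if subject.gender == "Male":
                other.gender = "Female"
            maiden = rec.get("nameSpouseMaiden")
            if maiden:
                other.alt_names.append(f"{name.split()[0]} {maiden}")
        subject.relations.append((relation, name))
        other.relations.append((INVERSE[relation], subject.name))
        people.append(other)
    return people


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
people = persons_from_record(record)
```

Each derived Person carries the names and relations a matcher would then compare against other Person records in the database.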

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews.

ONTO: Place enrichment, Person name recognition.
INRIA: word2vec experiments.

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
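A minimal sketch of how such neighbour lists and "semantic differencing" are computed, assuming word vectors are already available. The three-dimensional vectors and the placeholder words below are invented for illustration. They are not the EHRI word2vec model, which would have hundreds of dimensions learned from the interview text.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(target, vectors, exclude=()):
    """Rank all words by cosine to the target vector, most similar first."""
    scored = [(w, cosine(target, vec)) for w, vec in vectors.items() if w not in exclude]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def analogy(a, b, c, vectors):
    """Semantic differencing: rank words by closeness to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(target, vectors, exclude={a, b, c})


# Toy vectors (assumption): axes are "is an organization", "regime A", "regime B".
TOY = {
    "org_a":    [1.0, 1.0, 0.0],
    "leader_a": [0.0, 1.0, 0.0],
    "leader_b": [0.0, 0.0, 1.0],
    "org_b":    [1.0, 0.0, 1.0],
    "other":    [0.2, 0.9, 0.1],
}
ranking = analogy("org_a", "leader_a", "leader_b", TOY)
```

Here `org_a - leader_a + leader_b` lands exactly on `org_b`, which is the same arithmetic as the KGB/Stalin/Hitler example, only on hand-made data.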

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

Art History networks from Wikipedia, through VIAF id.
Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE

Alexiev, V. Using DBpedia in Europeana Food and Drink. DBpedia meeting, February 2016.

6.6.5 EFD ENRICHMENT: FRENCH

Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.

6.6.6 EFD PLACE ENRICHMENT

We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
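To give a feel for what marker clustering does, here is a minimal grid-based sketch. It is a simple stand-in technique, not the algorithm of the Cluster Mapper library. The function name, the cell size and the sample coordinates are assumptions for illustration.

```python
import math
from collections import defaultdict


def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points into square grid cells of cell_deg degrees.

    Returns one (centroid_lat, centroid_lon, count) tuple per non-empty cell,
    which a map can draw as a single numbered marker.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append((sum(p[0] for p in members) / n,
                         sum(p[1] for p in members) / n,
                         n))
    return clusters


# Two points close together and one far away (approximate coordinates).
sample = [(42.70, 23.32), (42.65, 23.38), (48.85, 2.35)]
clusters = grid_clusters(sample, cell_deg=1.0)
```

A real map library typically recomputes clusters per zoom level; varying `cell_deg` with the zoom would imitate that.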

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
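Jittering can be as simple as adding a small random offset to each marker. The sketch below assumes a uniform offset in degrees. The function name, the offset size and the approximate country-centre coordinates are illustrative choices, not the values used in the project.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset in each direction."""
    return (lat + rng.uniform(-max_offset_deg, max_offset_deg),
            lon + rng.uniform(-max_offset_deg, max_offset_deg))


# Approximate centre of Bulgaria (assumption); a seeded generator keeps the
# layout stable between page loads.
rng = random.Random(0)
flags = [jitter(42.7, 25.5, 0.5, rng) for _ in range(100)]
```

Seeding the generator (or deriving the offset from the object id) is a design choice worth making, so that a given object's flag does not jump around every time the map is redrawn.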

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.

GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J. Cobb 2014).
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT 2014-02, TGN 2014-08, ULAN 2015-03.

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification.
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn).
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc).
GraphDB semantic repo, clustered for high-availability.
Semantic application development (customized Forest user interface) and tech consulting.
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql.
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc.
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD.
Help desk, support on twitter and google group (see home page).
Presentations, papers:

Alexiev, V., Lindenthal, J., and Isaac, A. On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). International Journal on Digital Libraries, August 2015, Springer.

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies:

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic System geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
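For readers new to RDF prefixes, a small sketch of how a prefixed name such as `skos:prefLabel` expands to a full URI. The namespace URIs below are the commonly published ones for these ontologies and are an addition of this example, since the slide lists only the prefixes. The `expand` helper is likewise hypothetical.

```python
# Commonly published namespace URIs for the prefixes in the table
# (an assumption of this sketch; check the GVP documentation for the exact set).
PREFIXES = {
    "bibo":   "http://purl.org/ontology/bibo/",
    "dc":     "http://purl.org/dc/elements/1.1/",
    "dct":    "http://purl.org/dc/terms/",
    "foaf":   "http://xmlns.com/foaf/0.1/",
    "iso":    "http://purl.org/iso25964/skos-thes#",
    "owl":    "http://www.w3.org/2002/07/owl#",
    "prov":   "http://www.w3.org/ns/prov#",
    "rdf":    "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs":   "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "skos":   "http://www.w3.org/2004/02/skos/core#",
    "skosxl": "http://www.w3.org/2008/05/skos-xl#",
    "wgs":    "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "xsd":    "http://www.w3.org/2001/XMLSchema#",
}


def expand(curie: str) -> str:
    """Expand 'prefix:local' to a full URI; raise ValueError for unknown prefixes."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local


def sparql_prologue(prefixes=PREFIXES) -> str:
    """Render the PREFIX declarations that would precede a SPARQL query."""
    return "\n".join(f"PREFIX {p}: <{uri}>" for p, uri in prefixes.items())
```

For example, `expand("iso:ThesaurusArray")` yields the full URI of the ISO 25964 class named in the table.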

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>).

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
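What inverse and symmetric declarations buy you can be shown with a tiny forward-chaining sketch: state a relation once, and the reverse direction is derived. In practice a reasoning-capable repository does this for you. The function below only illustrates the semantics, and the `ex:` property names are invented, not actual GVP relations.

```python
def materialize_inverses(triples, inverse_of, symmetric):
    """Add the reverse triple for every inverse-pair or symmetric property.

    triples    -- iterable of (subject, predicate, object)
    inverse_of -- dict mapping a property to its owl:inverseOf partner
    symmetric  -- set of owl:SymmetricProperty names (self-inverse)
    """
    # owl:inverseOf works in both directions, so index the pairs both ways.
    inverse = dict(inverse_of)
    inverse.update({v: k for k, v in inverse_of.items()})

    result = set(triples)
    for s, p, o in triples:
        if p in symmetric:
            result.add((o, p, s))
        elif p in inverse:
            result.add((o, inverse[p], s))
    return result


# Hypothetical associative relations, for illustration only.
stated = {
    ("ex:workshop1", "ex:produces", "ex:vase1"),
    ("ex:termA", "ex:distinguishedFrom", "ex:termB"),
}
closed = materialize_inverses(
    stated,
    inverse_of={"ex:produces": "ex:producedBy"},
    symmetric={"ex:distinguishedFrom"},
)
```

Generating both members of each pair from one spreadsheet row is what keeps the two directions from drifting apart.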

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Alexiev, V., Cobb, J., Garcia, G., Harpring, P. Getty Vocabularies Linked Open Data: Semantic Representation. Getty Research Institute, 3.2 edition, March 2015.

6.6.7 EFD PLACE ENRICHMENT

Hierarchical semantic facet based on Geonames.

6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING

Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
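To make the idea of marker clustering concrete, here is a minimal sketch in Python. It is not the algorithm of the Cluster Mapper library (the slide does not describe it); it assumes a simple grid approach, and the function name `grid_clusters`, the `cell_deg` parameter and the sample coordinates are all hypothetical.

```python
from collections import defaultdict

def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points into square grid cells and summarize each cell.

    Returns a list of (centroid_lat, centroid_lon, count), largest cluster first.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        # Points that fall into the same cell_deg x cell_deg cell share a key.
        key = (int(lat // cell_deg), int(lon // cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        centroid_lat = sum(lat for lat, _ in members) / n
        centroid_lon = sum(lon for _, lon in members) / n
        clusters.append((centroid_lat, centroid_lon, n))
    return sorted(clusters, key=lambda c: c[2], reverse=True)

# Illustrative coordinates only (roughly Sofia, Plovdiv, Vienna).
points = [(42.7, 23.3), (42.1, 24.7), (48.2, 16.4)]
clusters = grid_clusters(points, cell_deg=5.0)
for lat, lon, count in clusters:
    print(f"{count} object(s) near {lat:.2f}, {lon:.2f}")
```

A map widget would then draw one marker per cluster and show the count, re-clustering with a smaller cell size as the user zooms in.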

6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING

There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them.
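Jittering can be sketched as adding a small random offset to each marker that shares the same coordinates. The following is only an illustration under assumptions: the centroid coordinates are approximate, and the `max_offset_deg` value and the uniform distribution are choices made here, not taken from the EFD implementation.

```python
import random

def jitter(lat, lon, max_offset_deg=0.5, rng=None):
    """Return (lat, lon) shifted by a random offset of at most max_offset_deg."""
    if rng is None:
        rng = random.Random()
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )

# Assumed approximate centroid of Bulgaria; every object tagged only
# "Bulgaria" would otherwise be drawn at exactly this point.
CENTER = (42.75, 25.5)

rng = random.Random(42)  # fixed seed so the layout is reproducible
markers = [jitter(*CENTER, rng=rng) for _ in range(5)]
for lat, lon in markers:
    print(f"{lat:.4f}, {lon:.4f}")
```

Using a seeded generator keeps each object at a stable position between page loads, which matters if users bookmark or share map views.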

6.6.10 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper categories, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.

6.7 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM
Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (diagram by J.Cobb, 2014)
GVP Training Materials

6.7.1 GVP LOD RELEASES

Publicized in blog posts by J.Cuno, head of the Getty Trust.
AAT 2014-02, TGN 2014-08, ULAN 2015-03

6.7.2 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: Alexiev V., Lindenthal J., and Isaac A. On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). International Journal on Digital Libraries, August 2015, Springer
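As a rough illustration of what a SPARQL 1.1 compliant endpoint means for a client, the sketch below builds a GET request URL with the standard `query` parameter of the SPARQL Protocol. It makes no network call. The helper names, the exact-label query and the sample label "rhyta" are assumptions made for this example; the query uses only generic SKOS and is not one of the documented GVP sample queries.

```python
from urllib.parse import urlencode, urlparse, parse_qs

ENDPOINT = "http://vocab.getty.edu/sparql"  # endpoint URL as given on the slide

def build_label_query(label, lang="en", limit=10):
    """Build a SPARQL SELECT that looks up concepts by exact preferred label."""
    # Naive escaping of characters that would break the string literal.
    safe = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?concept WHERE {\n"
        f'  ?concept skos:prefLabel "{safe}"@{lang} .\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )

def build_request_url(query):
    """Encode the query as the 'query' parameter of a SPARQL Protocol GET."""
    return ENDPOINT + "?" + urlencode({"query": query})

query = build_label_query("rhyta")  # hypothetical label, may or may not match
url = build_request_url(query)
decoded = parse_qs(urlparse(url).query)["query"][0]  # round-trip check
print(url)
```

A real client would send this URL with an `Accept` header such as `application/sparql-results+json` and parse the bindings from the response.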

6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO

See V.Alexiev, GVP LOD: Ontologies and Semantic Representation, CIDOC 2014

External Ontologies

| Prefix | Ontology | Used for |
|--------|----------|----------|
| bibo | Bibliography Ontology | Sources |
| dc | Dublin Core Elements | common |
| dct | Dublin Core Terms | common |
| foaf | Friend of a Friend ontology | Contributors |
| iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI |
| owl | Web Ontology Language | Basic RDF representation |
| prov | Provenance Ontology | Revision history |
| rdf | Resource Description Framework | Basic RDF representation |
| rdfs | RDF Schema | Basic RDF representation |
| schema | Schema.org | common, geo (TGN), bio (ULAN) |
| skos | Simple Knowledge Organization System | Basic vocabulary representation |
| skosxl | SKOS Extension for Labels | Rich labels |
| wgs | W3C World Geodetic Survey geo | Geo (TGN) |
| xsd | XML Schema Datatypes | Basic RDF representation |
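The prefixes in the table are shorthand for namespace IRIs. A small sketch of how a prefixed name is expanded is shown below. The slide lists only the prefixes; the namespace IRIs are the commonly published ones for these ontologies and are added here as an assumption, as is the helper name `expand`.

```python
# Prefix -> namespace IRI. The IRIs are not on the slide; they are the
# commonly published namespaces for the listed ontologies.
NAMESPACES = {
    "bibo": "http://purl.org/ontology/bibo/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "iso": "http://purl.org/iso25964/skos-thes#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "prov": "http://www.w3.org/ns/prov#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "skosxl": "http://www.w3.org/2008/05/skos-xl#",
    "wgs": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}

def expand(curie):
    """Expand a prefixed name such as 'skos:prefLabel' to a full IRI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in NAMESPACES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return NAMESPACES[prefix] + local

print(expand("skos:prefLabel"))
print(expand("iso:ThesaurusArray"))
```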

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. Key values can be mapped to:
Custom sub-class
Custom (sub-)property
Ontology value (eg <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
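What inverse and symmetric declarations buy you can be sketched in a few lines: from each asserted relation, the opposite direction is derived automatically. This is a toy illustration, not the GVP rule set; the relation names (`ex:teacherOf`, `ex:studentOf`, `ex:siblingOf`) and the entities are hypothetical.

```python
# Hypothetical declarations standing in for owl:inverseOf pairs and
# owl:SymmetricProperty (self-inverse) relations.
INVERSE_OF = {"ex:teacherOf": "ex:studentOf"}
SYMMETRIC = {"ex:siblingOf"}

def add_inverses(triples):
    """Return the input triples plus the triples implied by the declarations."""
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})  # both directions
    inverse.update({p: p for p in SYMMETRIC})               # self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result

asserted = [
    ("ex:A", "ex:teacherOf", "ex:B"),
    ("ex:B", "ex:siblingOf", "ex:C"),
    ("ex:C", "ex:studentOf", "ex:D"),
]
closed = add_inverses(asserted)
for triple in sorted(closed):
    print(triple)
```

In a triple store such as the one described here, a reasoner would typically derive these statements from the ontology, so only one direction needs to be loaded.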

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Alexiev V., Cobb J., Garcia G., Harpring P. Getty Vocabularies: Linked Open Data Semantic Representation. Getty Research Institute, 3.2 edition, March 2015

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in J.Cobb 2014, Getty Vocabs: Why LOD? Why Now?

Eg AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels

6.8 J.P. GETTY MUSEUM

Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.

6.8.1 J.P. GETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.

6.9 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) "Cast after".

6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo. CIDOC Congress 2016

6.10.1 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
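The step from one flat record to several Person records can be sketched as follows. The field names come from the sample record above; everything else (the generated identifiers, the `inferred` dictionary, and the specific inferences chosen) is a hypothetical illustration rather than the EHRI implementation.

```python
record = {
    "id": "123456",
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

# Field naming a related person -> relation of that person to the main one.
RELATED_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}

def derive_persons(rec):
    """Create Person records for everyone mentioned, plus relation triples."""
    main_id = rec["id"]
    persons = {
        main_id: {
            "name": f"{rec['firstName']} {rec['lastName']}",
            "gender": rec.get("gender"),
        }
    }
    relations = []
    for field, relation in RELATED_FIELDS.items():
        name = rec.get(field)
        if not name:
            continue
        pid = f"{main_id}-{relation}"  # made-up identifier scheme
        person = {"name": name, "inferred": {}}
        if relation == "spouse":
            if rec.get("nameSpouseMaiden"):
                person["maidenName"] = rec["nameSpouseMaiden"]
            if rec.get("dateMarriage"):
                # The spouse shares the marriage date of the main person.
                person["inferred"]["dateMarriage"] = rec["dateMarriage"]
        elif relation == "child" and rec.get("dateMarriage"):
            # Likely, not certain: the child was born after the marriage.
            person["inferred"]["bornAfter"] = rec["dateMarriage"]
        persons[pid] = person
        relations.append((main_id, relation, pid))
    return persons, relations

persons, relations = derive_persons(record)
for pid, person in persons.items():
    print(pid, person)
```

Keeping inferred values separate from stated ones lets a later matching step weigh them differently when comparing against other Person records.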

6.10.2 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames in free text was also developed.
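Why matching to a gazetteer also deduplicates can be shown with a toy example: spelling variants that resolve to the same gazetteer entry collapse into one place. The sketch assumes simple normalized exact-name lookup, which is far cruder than a real matching pipeline. The gazetteer content and the identifiers `place-1` and `place-2` are made up and are not Geonames IDs.

```python
import unicodedata

def normalize(name):
    """Lowercase, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.casefold().split())

# Toy gazetteer: made-up identifiers mapped to known name variants.
GAZETTEER = {
    "place-1": ["Kraków", "Cracow", "Krakau"],
    "place-2": ["Wien", "Vienna"],
}

def build_index(gazetteer):
    """Map every normalized name variant to its place identifier."""
    index = {}
    for place_id, names in gazetteer.items():
        for name in names:
            index[normalize(name)] = place_id
    return index

def match_places(source_names, gazetteer):
    """Return ({place_id: [source names]}, [unmatched names])."""
    index = build_index(gazetteer)
    matched, unmatched = {}, []
    for name in source_names:
        place_id = index.get(normalize(name))
        if place_id is None:
            unmatched.append(name)
        else:
            matched.setdefault(place_id, []).append(name)
    return matched, unmatched

source = ["Krakow", "KRAKÓW ", "Cracow", "Vienna", "Atlantis"]
matched, unmatched = match_places(source, GAZETTEER)
print(matched)
print(unmatched)
```

Three differently spelled source entries end up under one identifier, which is the deduplication effect; names left in `unmatched` would go to fuzzier matching or manual review.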

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

| guard | Cos dist | punishment | Cos dist |
|-------|----------|------------|----------|
| guarding | 0.593507 | punishments | 0.668144 |
| sentry | 0.512083 | punish | 0.601212 |
| hlinka | 0.496201 | punishing | 0.543213 |
| gate | 0.490032 | beatings | 0.527033 |
| watching | 0.484647 | penalty | 0.497262 |
| rifle | 0.484379 | deserved | 0.490157 |
| lookout | 0.482025 | beaten | 0.473870 |
| patrol | 0.477233 | straf | 0.473338 |
| soldier | 0.475982 | offense | 0.461230 |
| guarded | 0.474689 | executing | 0.459965 |
| police | 0.474291 | merciless | 0.455123 |

semantic differencing (interesting): KGB - Stalin + Hitler = SS
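The "semantic differencing" line is vector arithmetic on word embeddings followed by a nearest-neighbour search by cosine similarity. The sketch below shows the mechanics with hand-made 4-dimensional toy vectors, not vectors trained on the interviews; the dimensions (Soviet, German, person, organization) and all values are assumptions chosen so that the arithmetic is easy to follow. The scores in the table above appear to be of this cosine kind, with higher values meaning closer words.

```python
import math

# Toy vectors: [Soviet, German, person, organization]. Hand-made for
# illustration only; real word2vec vectors have hundreds of dimensions.
VECTORS = {
    "Stalin": [1.0, 0.0, 1.0, 0.0],
    "Hitler": [0.0, 1.0, 1.0, 0.0],
    "KGB":    [1.0, 0.0, 0.0, 1.0],
    "SS":     [0.0, 1.0, 0.0, 1.0],
    "guard":  [0.0, 0.0, 0.5, 0.5],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(positive, negative, vectors):
    """Add the positive words, subtract the negative ones, return the nearest other word."""
    dims = len(next(iter(vectors.values())))
    target = [0.0] * dims
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    exclude = set(positive) | set(negative)
    scores = {w: cosine(target, v) for w, v in vectors.items() if w not in exclude}
    return max(scores, key=scores.get)

# KGB - Stalin + Hitler
best = analogy(positive=["KGB", "Hitler"], negative=["Stalin"], vectors=VECTORS)
print(best)
```

With these toy vectors the target is [0, 1, 0, 1], which coincides with the vector for "SS", so that word is returned.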

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing them to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. Eg a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius…, rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki

Page 100: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

6610 GLAMS WORKING WITH WIKIDATA

Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience

Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms

easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition

Europeana Wikimedia Taskforce report

GLAMs Working with Wikidata

67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03

672 ONTOTEXT SCOPE OF WORK

Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers

Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer

httpvocabgettyeduontologyISO 25964 ontology

httpvocabgettyedusparqlhttpvocabgettyedudoc

On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

V. Alexiev, L. Brazzo. Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. CIDOC Congress 2016.

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
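The idea above can be sketched as follows. This is only an illustration built on the sample record: the `Person` class, the relation labels and the rule that a spouse's maiden name yields an alternative name are assumptions, not the EHRI implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    alt_names: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (relation, other person's name)


# Field names follow the sample record; the relation labels are illustrative.
MENTION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


def persons_from_record(rec):
    """Create a Person for the record's subject and for each person it mentions."""
    main = Person(f'{rec["firstName"]} {rec["lastName"]}')
    people = [main]
    for field_name, relation in MENTION_FIELDS.items():
        if field_name not in rec:
            continue
        other = Person(rec[field_name])
        main.relations.append((relation, other.name))
        # Likely inference (an assumption): the spouse may also appear
        # elsewhere under given name + maiden name.
        if relation == "spouse" and "nameSpouseMaiden" in rec:
            given = other.name.split()[0]
            other.alt_names.append(f'{given} {rec["nameSpouseMaiden"]}')
        people.append(other)
    return people


if __name__ == "__main__":
    record = {
        "firstName": "John", "lastName": "Smith", "gender": "Male",
        "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
        "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
        "nameSibling": "Jack Jones",
    }
    for person in persons_from_record(record):
        print(person)
```

Each derived Person can then be compared, by name and alternative names, with the other Person records in the database.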

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
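How matching against a gazetteer also gives deduplication can be shown with a deliberately simple sketch: source names that normalize to the same gazetteer entry end up with the same identifier. The exact-match-on-normalized-name technique and the `g1`/`g2` identifiers are assumptions for illustration; the slide does not describe the actual EHRI matching method.

```python
import unicodedata


def normalize(name):
    """Case-fold, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def match_places(source_names, gazetteer):
    """Map each source name to a gazetteer id when the match is unambiguous.

    gazetteer: dict of id -> list of names (main name plus alternates).
    """
    index = {}
    for gid, names in gazetteer.items():
        for n in names:
            index.setdefault(normalize(n), set()).add(gid)
    matches = {}
    for name in source_names:
        ids = index.get(normalize(name), set())
        if len(ids) == 1:  # skip unknown and ambiguous names
            matches[name] = next(iter(ids))
    return matches


if __name__ == "__main__":
    gaz = {"g1": ["Kraków", "Cracow"], "g2": ["Lwów", "Lviv"]}  # hypothetical ids
    print(match_places(["KRAKOW", "Cracow", "Lviv", "Nowhere"], gaz))
```

Names mapped to the same identifier are duplicates of each other, so deduplication falls out of the matching step.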

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
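Both the neighbour lists and the "semantic differencing" rest on the same two operations over word vectors: cosine similarity and vector arithmetic. The sketch below uses tiny hand-made 3-dimensional vectors and the textbook king/man/woman example, which are assumptions chosen for readability; real word2vec vectors have hundreds of dimensions and are learned from the interview text. The scores in the table behave like cosine similarities (higher means closer), which is how word2vec tools commonly label their "distance" output.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(target, vectors, exclude=()):
    """Word whose vector is most similar to the target vector."""
    scores = {w: cosine(target, vec) for w, vec in vectors.items() if w not in exclude}
    return max(scores, key=scores.get)


def analogy(a, b, c, vectors):
    """Word closest to vector(a) - vector(b) + vector(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(target, vectors, exclude={a, b, c})


# Toy vectors with axes (royal, male, female); purely illustrative.
TOY = {
    "king": [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man": [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "prince": [0.8, 0.9, 0.0],
}

if __name__ == "__main__":
    print(analogy("king", "man", "woman", TOY))
```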

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

67 GETTY VOCABULARY PROGRAM LOD

GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials

671 GVP LOD RELEASES

Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03

672 ONTOTEXT SCOPE OF WORK

Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:

Alexiev V., Lindenthal J., and Isaac A. On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). International Journal on Digital Libraries, August 2015, Springer.

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See V. Alexiev, "GVP LOD: Ontologies and Semantic Representation", CIDOC 2014.

External Ontologies

Page 104: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

673 COMPLETE REPRESENTATION OF ALL GVP INFO

See VAlexiev CIDOC 2014External Ontologies

GVP LOD Ontologies and Semantic Representation

Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation

674 GVP SEMANTIC REPRESENTATION (1)

675 GVP SEMANTIC REPRESENTATION (2)

676 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)

677 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)

678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6.10.3 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH interviews.

ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
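The two operations behind these results, cosine comparison and vector arithmetic, can be shown with hand-made toy vectors. The vectors below are invented for illustration and are not taken from the trained model. Note that the "Cos dist" values in the table appear to behave like cosine similarity, since the closest word forms score highest.

```python
import math
from typing import Dict, List, Sequence

# Toy vectors over four made-up axes: (soviet, nazi, leader, organization).
TOY_VECTORS: Dict[str, List[float]] = {
    "stalin": [1.0, 0.0, 1.0, 0.0],
    "hitler": [0.0, 1.0, 1.0, 0.0],
    "kgb":    [1.0, 0.0, 0.0, 1.0],
    "ss":     [0.0, 1.0, 0.0, 1.0],
    "guard":  [0.2, 0.2, 0.0, 0.3],
}


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity: 1.0 for the same direction, 0.0 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(positive: List[str], negative: List[str],
            vectors: Dict[str, List[float]] = TOY_VECTORS) -> str:
    """Return the word closest to sum(positive) - sum(negative).

    The input words themselves are excluded from the candidates.
    """
    dims = len(next(iter(vectors.values())))
    target = [0.0] * dims
    for word in positive:
        target = [t + x for t, x in zip(target, vectors[word])]
    for word in negative:
        target = [t - x for t, x in zip(target, vectors[word])]
    candidates = [w for w in vectors if w not in positive and w not in negative]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# KGB - Stalin + Hitler
result = analogy(positive=["kgb", "hitler"], negative=["stalin"])
```

With real embeddings the vectors have hundreds of dimensions learned from the interview text, but the arithmetic is the same.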

6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS

And referencing to Geonames, so we can get coordinates.

6.11 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper).

Art History networks from Wikipedia, through VIAF id.
Time and nationality from ULAN.

6.12 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

6.13 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD

6.13.1 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6.13.2 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).

6.13.3 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.

6.13.4 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6.13.5 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6.13.6 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6.13.7 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6.13.8 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

Page 105: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

6.7.4 GVP SEMANTIC REPRESENTATION (1)

6.7.5 GVP SEMANTIC REPRESENTATION (2)

6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT

Excel-driven Ontology Generation™. A key value can be mapped to:

Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)

6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE

More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
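What an inverse pair or a symmetric property implies for the data can be sketched as a small materialization step. The relation names below are hypothetical and are not actual GVP relations, and a real triplestore would normally derive these statements through its own OWL reasoning rather than in application code.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical relation names, for illustration only.
INVERSE_OF: Dict[str, str] = {"ex:teacherOf": "ex:studentOf"}
SYMMETRIC: Set[str] = {"ex:siblingOf"}


def inverse_property(prop: str) -> Optional[str]:
    """Return the inverse of a property, itself if symmetric, else None."""
    if prop in SYMMETRIC:
        return prop
    for forward, backward in INVERSE_OF.items():
        if prop == forward:
            return backward
        if prop == backward:
            return forward
    return None


def materialize_inverses(triples: Iterable[Triple]) -> Set[Triple]:
    """Add the inverse statement for every triple whose property has one."""
    result = set(triples)
    for subject, prop, obj in list(result):
        inverse = inverse_property(prop)
        if inverse is not None:
            result.add((obj, inverse, subject))
    return result


asserted = [
    ("ex:A", "ex:teacherOf", "ex:B"),
    ("ex:C", "ex:siblingOf", "ex:D"),
    ("ex:E", "ex:unrelatedProp", "ex:F"),
]
closed = materialize_inverses(asserted)
```

Declaring each relation once with its inverse means either direction can be queried regardless of which one was recorded.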

6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS

6.7.9 COMPREHENSIVE DOCUMENTATION

Alexiev, V., Cobb, J., Garcia, G., Harpring, P.: Getty Vocabularies Linked Open Data: Semantic Representation. Getty Research Institute, 3.2 edition, March 2015.

6.7.10 SAMPLE QUERIES (100), INTEGRATED UI

Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).

6.7.11 GVP VOCABS USAGE

Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in J. Cobb, Getty Vocabs: Why LOD, Why Now (2014).

E.g. AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.

6.7.12 AAT IN EUROPEANA

Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.

Page 110: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

679 COMPREHENSIVE DOCUMENTATION

Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation

6710 SAMPLE QUERIES (100) INTEGRATED UI

Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)

6711 GVP VOCABS USAGE

Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now

AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc

Cataloging Calculator

6712 AAT IN EUROPEANA

typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich

68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like

681 JPGETTY MUSEUM AND WIKIDATA

Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid

69 AMERICAN ART COLLABORATIVE

American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.

Work is ongoing at https://github.com/american-art; e.g. see the NPG mapping issues. E.g. a possible mapping of a (sculpture) "Cast after".

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.

In its first phase (2011-2015), it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.

In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016.

6101 EHRI PERSON NETWORKS

Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
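The idea of deriving Person records from one flat record can be sketched roughly as follows. This is only an illustration under assumptions: the `Person` class, the function names, the opposite-gender rule for the spouse and the exact-name matching are all hypothetical and are not the actual EHRI pipeline. Inferred attributes are flagged so they can be treated as guesses.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    name: str
    relation: str                      # relation to the record's main person
    gender: Optional[str] = None
    maiden_name: Optional[str] = None
    inferred: list = field(default_factory=list)  # attributes that are guesses


def persons_from_record(rec: dict) -> list:
    """Create one Person per individual mentioned in a flat record."""
    people = [Person(name=f"{rec['firstName']} {rec['lastName']}",
                     relation="self", gender=rec.get("gender"))]
    if rec.get("nameSpouse"):
        spouse = Person(name=rec["nameSpouse"], relation="spouse",
                        maiden_name=rec.get("nameSpouseMaiden"))
        # Likely inference only (assumption): spouse has the opposite gender.
        opposite = {"Male": "Female", "Female": "Male"}.get(rec.get("gender"))
        if opposite:
            spouse.gender = opposite
            spouse.inferred.append("gender")
        people.append(spouse)
    for key, relation in (("nameChild", "child"), ("nameSibling", "sibling")):
        if rec.get(key):
            people.append(Person(name=rec[key], relation=relation))
    return people


def candidate_matches(person: Person, database: list) -> list:
    """Naive matching: case-insensitive comparison of full names."""
    key = person.name.casefold()
    return [p for p in database if p.name.casefold() == key]


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
people = persons_from_record(record)
```

A real matcher would need fuzzy name comparison and would weigh dates and places as well; exact name equality is used here only to keep the sketch short.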

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
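A minimal sketch of how matching place names to a gazetteer also yields deduplication: variant spellings that resolve to the same gazetteer entry end up in one group. The mini-gazetteer, its numeric ids, the threshold and the function names are hypothetical placeholders, not real Geonames data or the EHRI implementation. String similarity uses Python's standard `difflib`.

```python
from difflib import SequenceMatcher

# Hypothetical mini-gazetteer: id -> (preferred name, alternate names).
# The ids are placeholders, not real Geonames identifiers.
GAZETTEER = {
    1001: ("Warsaw", ["Warszawa", "Warschau"]),
    1002: ("Krakow", ["Kraków", "Cracow", "Krakau"]),
    1003: ("Lviv", ["Lwów", "Lemberg", "Lvov"]),
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def match_place(name: str, threshold: float = 0.85):
    """Return the id of the best-matching gazetteer entry, or None."""
    best_id, best_score = None, 0.0
    for gid, (preferred, alternates) in GAZETTEER.items():
        score = max(similarity(name, n) for n in [preferred, *alternates])
        if score > best_score:
            best_id, best_score = gid, score
    return best_id if best_score >= threshold else None


def deduplicate(names: list) -> dict:
    """Group source names by the gazetteer entry they resolve to."""
    groups = {}
    for name in names:
        groups.setdefault(match_place(name), []).append(name)
    return groups
```

Names that fall below the threshold are grouped under `None`, so they can be reviewed by hand rather than silently merged.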

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment   Cos dist
guarding   0.593507    punishments  0.668144
sentry     0.512083    punish       0.601212
hlinka     0.496201    punishing    0.543213
gate       0.490032    beatings     0.527033
watching   0.484647    penalty      0.497262
rifle      0.484379    deserved     0.490157
lookout    0.482025    beaten       0.473870
patrol     0.477233    straf        0.473338
soldier    0.475982    offense      0.461230
guarded    0.474689    executing    0.459965
police     0.474291    merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
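The two operations behind these results, ranking neighbours by cosine score and "semantic differencing" by vector arithmetic, can be sketched in a few lines. The three-dimensional vectors below are invented toy values using the classic king/queen example; they are not taken from the EHRI interviews, and the function names are hypothetical. The scores in the table above appear to be cosine similarities (higher means closer), which is what `cosine` computes here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(query, vectors, exclude=(), k=3):
    """Return the k words whose vectors score highest against the query."""
    scored = [(word, cosine(query, vec))
              for word, vec in vectors.items() if word not in exclude]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c)."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(query, vectors, exclude={a, b, c}, k=1)[0][0]


# Toy vectors, invented for illustration only.
TOY = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.0],
}

result = analogy("king", "man", "woman", TOY)
```

With real embeddings the same arithmetic runs over vectors of a few hundred dimensions trained on the corpus; the input words are excluded from the ranking because they would otherwise tend to dominate it.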

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHER PROJECTS: WIKIARTHISTORY

Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.

612 CHARTEX

NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.

613 NUMISMATICS

My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:

Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).

6133 NUMISHARE

Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).

6134 NOMISMA

Shared authorities for numismatics. E.g. a mint.

6135 COINHOARDS

Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.

Page 116: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical

mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative

Work ongoing at eg see Eg possible mapping of (sculpture) Cast after

httpsgithubcomamerican-art NPG mapping issues

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations

In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks

Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question how person networks in uenced chance of survival Idea

Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews

ONTO Place enrichment Person name recognitionINRIA word2vec experiments

guard Cos dist punishment Cos dist

guard Cos dist punishment Cos dist

guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123

semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS

6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS

And referencing to Geonames so we can get coordinates

611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper

Art History networks from Wikipedia through VIAF idTime and nationality from ULAN

612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT

613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD

Ethan Gruber

Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD

6131 COINS IN TIME AND SPACE

Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections

6132 GEOGRAPHIC DISTRIBUTION

Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)

6133 NUMISHARE

Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War

Art ofDevastation

6134 NOMISMA

Shared authorities for numismatics Eg a mint

6135 COINHOARDS

Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)

CoinHoardsorgnomismaorg

6136 STATISTICAL CHARTS

Denominations issued by Augustus, Tiberius…, rendered in a chart using d3.js
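Behind such a chart sits a simple aggregation: count coin types per issuer and denomination, then hand the counts to the charting library. A minimal sketch, assuming the input is a list of (issuer, denomination) pairs; the sample rows are invented and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def denominations_by_issuer(coin_types):
    """Count coin types per issuer and denomination."""
    counts = defaultdict(Counter)
    for issuer, denomination in coin_types:
        counts[issuer][denomination] += 1
    return {issuer: dict(c) for issuer, c in counts.items()}


# Invented sample rows, for illustration only.
sample = [
    ("Augustus", "denarius"),
    ("Augustus", "denarius"),
    ("Augustus", "aureus"),
    ("Tiberius", "denarius"),
]
chart_data = denominations_by_issuer(sample)
print(chart_data)
```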

6137 KERAMEIKOS POTTERY LOD

Kerameikos Project editor: based on XForms, leverages Getty and BM LOD

6138 EADITOR AND XEAC

Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed (Blog, Wiki)

Page 117: Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage

610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE

EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale, and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016

6101 EHRI PERSON NETWORKS

Research question: how person networks influenced chance of survival. Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
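The idea above can be sketched roughly as follows. The field names come from the sample record; the Person class, the inference rule for the spouse's gender, and the exact-name match key are simplifying assumptions for illustration, not the project's actual method.

```python
from dataclasses import dataclass
from typing import Optional

# Record fields that mention a relative, and the relation they imply.
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


@dataclass
class Person:
    first_name: str
    last_name: str
    relation: str
    source_rec: str
    gender: Optional[str] = None
    maiden_name: Optional[str] = None


def split_name(full_name):
    """Split 'Maria Smith' into ('Maria', 'Smith'); the last token is the surname."""
    first, _, last = full_name.rpartition(" ")
    return first, last


def derive_persons(rec):
    """Create a Person record for every relative mentioned in rec."""
    persons = []
    for fld, relation in RELATION_FIELDS.items():
        name = rec.get(fld)
        if not name:
            continue
        first, last = split_name(name)
        person = Person(first, last, relation, rec["id"])
        if relation == "spouse":
            person.maiden_name = rec.get("nameSpouseMaiden")
            # Likely inference (assumption): the spouse has the opposite gender.
            opposite = {"Male": "Female", "Female": "Male"}
            person.gender = opposite.get(rec.get("gender"))
        persons.append(person)
    return persons


def match_key(person):
    """Naive match key: case-insensitive first and last name."""
    return (person.first_name.casefold(), person.last_name.casefold())


def find_matches(person, database):
    """Return the Person records in database that share person's match key."""
    return [c for c in database if match_key(c) == match_key(person)]


rec = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
relatives = derive_persons(rec)
print(relatives)
```

A real matcher would need fuzzy name comparison plus dates and places, since exact name equality produces both false matches and misses.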

6102 EHRI LARGE-SCALE PLACE MATCHING

Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
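Why matching also deduplicates can be shown with a minimal sketch: source names that resolve to the same gazetteer entry collapse into one group. The tiny gazetteer, its "gaz:" identifiers and the normalization rule are assumptions for illustration; they are not real Geonames IDs or the project's actual pipeline.

```python
import unicodedata
from collections import defaultdict

# Toy gazetteer: hypothetical IDs mapped to a place's alternate names.
GAZETTEER = {
    "gaz:1": ["Kraków", "Krakau", "Cracow"],
    "gaz:2": ["Terezín", "Theresienstadt"],
}


def normalize(name):
    """Strip combining diacritics and case so 'Kraków' and 'KRAKOW' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()


# Index every normalized alternate name to its gazetteer ID.
INDEX = {normalize(alt): gid for gid, alts in GAZETTEER.items() for alt in alts}


def match_places(source_names):
    """Group source place names by gazetteer ID; return (matched, unmatched)."""
    matched = defaultdict(list)
    unmatched = []
    for name in source_names:
        gid = INDEX.get(normalize(name))
        if gid is None:
            unmatched.append(name)
        else:
            matched[gid].append(name)
    return dict(matched), unmatched


matched, unmatched = match_places(["Krakow", "KRAKAU", "Theresienstadt", "Nowhere"])
print(matched, unmatched)
```

Here "Krakow" and "KRAKAU" end up under the same identifier, which is the deduplication effect; names left in the unmatched list would need fuzzy matching or manual review.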

6103 EHRI ORAL HISTORY INTERVIEWS

Analyze 25k OH Interviews















